
Word2VisualVec for Video-To-Text Matching and Ranking
Jianfeng Dong 1, Xirong Li 2, Xiaoxu Wang 2, Qijie Wei 2, Weiyu Lan 2, Cees G. M. Snoek 3
1 Zhejiang University, 2 Renmin University of China, 3 University of Amsterdam

Our idea
- Project sentences into a video feature space
- Match sentences and videos in this space
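The matching step can be illustrated with a small sketch. It assumes that sentences have already been projected into the video feature space and that cosine similarity is the matching score; the slides do not name the similarity measure, so that choice, the function name `rank_sentences`, and the toy vectors are all assumptions made for illustration.

```python
import numpy as np

def rank_sentences(video_vec, sentence_vecs):
    """Rank sentences for one video by cosine similarity (assumed measure)."""
    v = video_vec / np.linalg.norm(video_vec)
    s = sentence_vecs / np.linalg.norm(sentence_vecs, axis=1, keepdims=True)
    scores = s @ v                      # one cosine score per sentence
    order = np.argsort(-scores)         # best match first
    return order, scores

# Toy 3-dim vectors standing in for real video and projected sentence features.
video = np.array([1.0, 0.0, 1.0])
sentences = np.array([
    [0.0, 1.0, 0.0],   # orthogonal to the video vector
    [1.0, 0.0, 0.9],   # nearly parallel to the video vector
    [0.5, 0.5, 0.5],   # partially aligned
])
order, scores = rank_sentences(video, sentences)
print(order)
```

Because both sides live in the same space, ranking reduces to a single matrix-vector product per video.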

Solution: Word2VisualVec
Transform text into a video feature vector
- Text side: text → word matrix → pooling → s(q) → h_1(q) = σ(W_1 s(q) + b_1) → σ(W_2 h_1(q) + b_2)
- Video side: video → CNN → Φ(x)
J. Dong, X. Li, C. Snoek, Word2VisualVec: Cross-Media Retrieval by Visual Feature Prediction, arXiv:1604.06838, 2016

Word2VisualVec
Transform text into a video feature vector
- Text side: text → word matrix (word2vec) → pooling → s(q) → h_1(q) = σ(W_1 s(q) + b_1) → σ(W_2 h_1(q) + b_2)
- Video side: video → CNN → Φ(x)

Word2VisualVec
Transform text into a video feature vector
- Text side: text → word matrix (word2vec) → pooling → s(q) → h_1(q) = σ(W_1 s(q) + b_1) → σ(W_2 h_1(q) + b_2)
- Video side: video → CNN → Φ(x)
word2vec + multi-layer perceptron
Minimize the mean squared error between the text vector and the video vector
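The forward pass and loss on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: σ is taken to be the logistic sigmoid and the pooling is taken to be mean pooling, neither of which the slide specifies. The layer sizes 500-1000-1024 follow the Implementation slide; the random weights, the 7-word sentence, and the stand-in video feature are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    # Assumed form of the activation written as σ on the slide.
    return 1.0 / (1.0 + np.exp(-z))

def word2visualvec(word_vectors, W1, b1, W2, b2):
    """Map a sentence (one word2vec row per word) to a predicted video feature."""
    s = word_vectors.mean(axis=0)       # s(q): pooled sentence vector (mean pooling assumed)
    h1 = sigmoid(W1 @ s + b1)           # h_1(q) = σ(W_1 s(q) + b_1)
    return sigmoid(W2 @ h1 + b2)        # σ(W_2 h_1(q) + b_2)

def mse(pred, target):
    """Mean squared error between the text vector and the video vector."""
    return float(np.mean((pred - target) ** 2))

# Placeholder parameters for the 500-1000-1024 architecture.
W1 = rng.normal(scale=0.01, size=(1000, 500))
b1 = np.zeros(1000)
W2 = rng.normal(scale=0.01, size=(1024, 1000))
b2 = np.zeros(1024)

words = rng.normal(size=(7, 500))       # hypothetical 7-word sentence
phi_x = rng.random(1024)                # stand-in for the CNN video feature Φ(x)

pred = word2visualvec(words, W1, b1, W2, b2)
loss = mse(pred, phi_x)
print(pred.shape, loss)
```

Training would adjust W_1, b_1, W_2, b_2 to reduce this loss over sentence-video pairs; the optimizer is not described on the slides and is left out here.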

Implementation
Two video features
- Visual: mean pooling over frame-level CNN features extracted by GoogleNet-shuffle [Mettes et al. ICMR16]
- Visual + Audio: GoogleNet-shuffle + bag of quantized MFCC
Word2Vec
- 500-dim, trained on user tags of 30M Flickr images
Word2VisualVec architecture
- For predicting the visual feature: 500-1000-1024
- For predicting the visual + audio feature: 500-1000-2048
Training set
- MSR-VTT training set of 6,513 videos [Xu et al. CVPR16]
Validation set
- TRECVID 200 training videos
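A rough sketch of how the two video features could be assembled is given below. Several details are assumptions not stated on the slide: a 1024-entry audio codebook (inferred only from 2048 minus 1024), 13 MFCC coefficients per audio frame, nearest-codeword quantization, and a histogram normalized to sum to 1. Random arrays stand in for real GoogleNet-shuffle and MFCC outputs.

```python
import numpy as np

rng = np.random.default_rng(0)

def visual_feature(frame_features):
    """Mean pooling over frame-level CNN features."""
    return frame_features.mean(axis=0)

def bag_of_quantized_mfcc(mfcc_frames, codebook):
    """Histogram of nearest-codeword assignments (assumed quantization scheme)."""
    # Squared Euclidean distance from every MFCC frame to every codeword.
    dists = ((mfcc_frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()            # normalization is an assumption

frames = rng.random((30, 1024))         # stand-in for 30 frames of CNN features
mfcc = rng.normal(size=(200, 13))       # stand-in for 200 MFCC frames
codebook = rng.normal(size=(1024, 13))  # hypothetical audio codebook

visual = visual_feature(frames)                   # 1024-dim
audio = bag_of_quantized_mfcc(mfcc, codebook)     # 1024-dim
visual_audio = np.concatenate([visual, audio])    # 2048-dim
print(visual.shape, visual_audio.shape)
```

The resulting dimensions line up with the output sizes of the two Word2VisualVec architectures listed above.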

Video-to-text results (set A, set B)
- Word2VisualVec is effective
- Adding the audio feature provides some improvement

Video-to-text results
Example 1
- Text → Visual: a man with a beard is wearing glasses
- Text → Visual + Audio: man talks into the camera
Example 2
- Text → Visual: soccer players are blocking the ball on a soccer field
- Text → Visual + Audio: a soccer player scores a goal on a soccer field
More results at http://lixirong.net/demo/vtt/tv16.html

Video Description Generation
J. Dong, X. Li, W. Lan, Y. Huo, C. Snoek, Early embedding and late reranking for video captioning, ACM Multimedia 2016

Idea: Re-use Video Tags for Captioning
- Predicted tags: track race race track woman. Generated caption: a group of people are running in a field
- Predicted tags: soccer player soccer field playing. Generated caption: a soccer player is playing a goal on a game
- Predicted tags: dance people woman dancing. Generated caption: people are dancing on a stage

Our solution: Google’s model for sentence generation
GoogleNet-shuffle → Google’s model [Vinyals et al. CVPR 2015] → candidate sentences:
- models are walking down the runway
- models are walking on the runway
- a woman is walking down the runway
- a woman is dancing
- …
- models are walking in a fashion show
- models are walking on the ramp

Our solution: Better initialization by tag embedding
Tags (fashion, walking, model) → re-encoding by Word2VisualVec → Google’s model [Vinyals et al. CVPR 2015] → candidate sentences:
- models are walking down the runway
- models are walking on the runway
- a woman is walking down the runway
- a woman is dancing
- …
- models are walking in a fashion show
- models are walking on the ramp

Our solution: Rerank sentences by matching with video tags
Tags (fashion, walking, model) → re-encoding by Word2VisualVec → Google’s model [Vinyals et al. CVPR 2015] → candidate sentences:
- models are walking down the runway
- models are walking on the runway
- a woman is walking down the runway
- a woman is dancing
- …
- models are walking in a fashion show
- models are walking on the ramp
Maximize tag matches → models are walking in a fashion show
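The reranking idea can be sketched as follows. The scoring rule used here, counting how many tags appear as exact words in a sentence, is an assumption chosen for illustration; the slide only says that tag matches are maximized. The function names are hypothetical, and the candidate list omits the elided sentences.

```python
def tag_matches(sentence, tags):
    """Count how many tags occur as whole words in the sentence."""
    words = set(sentence.lower().split())
    return sum(1 for tag in tags if tag.lower() in words)

def rerank(sentences, tags):
    """Sort candidates by tag matches, best first; ties keep their original order."""
    return sorted(sentences, key=lambda s: tag_matches(s, tags), reverse=True)

tags = ["fashion", "walking", "model"]
candidates = [
    "models are walking down the runway",
    "models are walking on the runway",
    "a woman is walking down the runway",
    "a woman is dancing",
    "models are walking in a fashion show",
    "models are walking on the ramp",
]

reranked = rerank(candidates, tags)
print(reranked[0])
```

With exact word matching, "models" does not match the tag "model", so the sentence containing both "walking" and "fashion" ranks first, which agrees with the sentence selected on the slide. A real system would likely use stemming or a softer matching score.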

Heuristics to add ‘where’
Two simple rules to append a ‘where’ description to the end of the generated sentences:
1. Add “on a $sport_name field” if $sport appears in the sentence, such as basketball, baseball, and football.
2. Add “on a stage” if “sing” or “dance” appears in the sentence.
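The two rules can be written as a short function. This is a sketch under stated assumptions: the sport list contains only the three examples named on the slide, and "sing" and "dance" are matched by the word prefixes "sing" and "danc" so that forms such as "dancing" are caught. The exact matching used by the authors is not given, and the function name `add_where` is hypothetical.

```python
SPORTS = ("basketball", "baseball", "football")   # only the examples named on the slide
STAGE_PREFIXES = ("sing", "danc")                 # assumed stems for "sing" / "dance"

def add_where(sentence):
    """Append a 'where' phrase to a generated sentence when a rule fires."""
    words = sentence.lower().split()
    # Rule 1: a sport name in the sentence adds "on a <sport> field".
    for sport in SPORTS:
        if sport in words:
            return f"{sentence} on a {sport} field"
    # Rule 2: a singing or dancing word adds "on a stage".
    if any(word.startswith(STAGE_PREFIXES) for word in words):
        return f"{sentence} on a stage"
    return sentence

print(add_where("a man is playing basketball"))
print(add_where("a woman is dancing"))
print(add_where("a cat is sleeping"))
```

Prefix matching is deliberately crude and would also fire on unrelated words such as "single"; it is kept simple here to mirror the spirit of the two rules.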

Description generation results
Adding “where” improves the performance

Live demo
http://lixirong.net/demo/vtt (accepts video files smaller than 10 MB)

Conclusion
- Word2VisualVec for video-to-text matching in video space
- Early embedding and late reranking improve LSTM-based video captioning
- Winning results in the VTT task
Xirong Li
