Scalable Understanding of Multilingual Media
Steve Renals
University of Edinburgh
Funded by the EU H2020 ICT Programme under Grant Agreement 688139
http://summa-project.eu
SUMMA in a nutshell
• Significantly improve media monitoring by the automatic
  ● analysis of media streams across many languages
  ● aggregation and distillation of stream content
  ● construction of knowledge bases from reported facts
  ● supply of media data visualisations at scale
BBC Monitoring
• 300 journalists, each monitoring
  ● up to 4 TV channels
  ● several online text sources
• 30 languages; the most important include Russian, Arabic and Farsi
Big Data
• 250 video channels
  ● 2.5Tb/day, 19Tb/week, 1Pb/year
• BBC Monitoring has access to
  ● 1,500 TV channels
  ● 1,350 radio sources
• But… ~700 free-to-air Arabic satellite channels, increasing by ~100/year
• Current monitoring processes are largely manual and cannot keep up with the scale of the task
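As a rough sanity check on the data volumes quoted on the Big Data slide, the sketch below scales the daily figure to a week and a year and derives an average per-channel rate. It assumes that "Tb" and "Pb" mean terabytes and petabytes in decimal units, which the slide does not state; the function names are illustrative only and not part of the SUMMA platform.

```python
# Back-of-envelope check of the data volumes quoted on the "Big Data" slide.
# Assumption: "Tb" and "Pb" mean terabytes and petabytes in decimal units.

CHANNELS = 250          # video channels (from the slide)
TB_PER_DAY = 2.5        # total volume per day (from the slide)
SECONDS_PER_DAY = 24 * 60 * 60


def volume_tb(tb_per_day: float, days: int) -> float:
    """Total volume in TB accumulated over a number of days."""
    return tb_per_day * days


def per_channel_gb_per_day(tb_per_day: float, channels: int) -> float:
    """Average daily volume per channel in GB (1 TB = 1000 GB)."""
    return tb_per_day * 1000 / channels


def average_bitrate_mbit_s(gb_per_day: float) -> float:
    """Average bitrate in Mbit/s implied by a daily volume in GB."""
    megabits = gb_per_day * 1000 * 8   # GB -> MB -> Mbit
    return megabits / SECONDS_PER_DAY


if __name__ == "__main__":
    per_channel = per_channel_gb_per_day(TB_PER_DAY, CHANNELS)
    print(f"per week:    {volume_tb(TB_PER_DAY, 7):.1f} TB")
    print(f"per year:    {volume_tb(TB_PER_DAY, 365) / 1000:.2f} PB")
    print(f"per channel: {per_channel:.1f} GB/day")
    print(f"bitrate:     {average_bitrate_mbit_s(per_channel):.2f} Mbit/s")
```

Under these assumptions, straight multiplication of the daily figure gives 17.5 TB per week and roughly 0.91 PB per year, so the weekly and yearly numbers on the slide are presumably rounded or derived from figures not shown here.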
Use cases
1. External Media Monitoring
  ● identify emerging trends
  ● track people in the news
  ● monitor the evolution of storylines
2. Internal Media Monitoring
  ● manage multilingual content creation
  ● efficient reuse of content across languages
3. Data Journalism
  ● use the SUMMA platform for data-driven journalism
SUMMA Prototypes: UI Concept 1
[Figure: user interface mock-up. Recoverable labels: semantic tag word cloud (size indicates current frequency across region/language group); channel ID; "Now playing"; player (SD? HD?) and player controller; segment text with native and translated transcript; unique timestamp; tags shown underlined or highlighted, tag instances marked; add new tag (click pencil to 'underline', and enter text); tools (screen grab, snip video, save, attach); segment machine analysis confidence (possibly better represented graphically?)]
Platform & Technologies
• Visualisation & prototypes
• Ingest audio, video, text
• Identify entities & relations
• Summarisation & distillation
• Speech recognition
• Sentiment detection
• Machine translation
• Segmentation & clustering
Multilingual technologies
SUMMA system v0.1