Computer vision technologies for visual knowledge enrichment - PowerPoint PPT Presentation

Computer vision technologies for visual knowledge enrichment Miriam Redi, Research Scientist

Images are powerful tools for communication and knowledge sharing

Wikimedia spaces are still missing many images! Around 1/10 articles in 95% of items in

One source of Visual Knowledge

Computer Vision

How do we bridge the visual knowledge gap?
● Computer vision can help contributors
    ○ Organize images
    ○ Describe images
    ○ Search and find images

How do we fill the gap responsibly?
Knowledge equity: As a social movement, we will focus our efforts on the knowledge and communities that have been left out by structures of power and privilege. We will welcome people from every background to build strong and diverse communities. We will break down the social, political, and technical barriers preventing people from accessing and contributing to free knowledge.

RESPONSIBLE Computer Vision
● Unbiased (against stereotypes)
● Representative (multicultural)
● Empowering (vs exclusive)

Unbiased Machine Vision
Bias: learning from “human” data

VERY Large

Biased Machine Vision: Object detection

Unbiased Machine Vision: Interpretable Vision Algorithms
Automatically Checking for Stereotypes in Datasets
Miriam Redi, Nikhil Rasiwasia, Alejandro Jaimes, Gaurav Aggarwal. The Beauty of Capturing Faces: Rating the Quality of Digital Portraits. Face and Gesture Recognition, FG 2015, Ljubljana, Slovenia.

Unbiased Machine Vision: Computational Portrait Aesthetics
Computational Aesthetics: detecting beautiful images
Computational Portrait Aesthetics: detecting beautiful images of faces (NOT beautiful faces)

Unbiased Machine Vision: Computational Portrait Aesthetics
Data: hundreds of thousands of photographs annotated in terms of quality by photographers
Interpretable visual feature extraction
● Inspired by portrait photography
● Expanded with demographics
Example: eye sharpness features

Unbiased Machine Vision: Computational Portrait Aesthetics
Correlation between quality scores and individual features: the dataset is NOT biased! :)
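
The correlation check on this slide could be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the feature names, the toy values, the use of Pearson correlation, and the 0.2 threshold are all assumptions made for the example.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def feature_correlations(quality, features):
    """Correlate each named feature column with the quality scores."""
    return {name: pearson(values, quality) for name, values in features.items()}


def flag_features(correlations, watched, threshold=0.2):
    """Return watched features whose |correlation| reaches the threshold.

    The threshold is a hypothetical choice, not a value from the talk.
    """
    return [name for name in watched if abs(correlations[name]) >= threshold]


# Toy, made-up data: six portraits with a quality score and two features.
quality = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
features = {
    "eye_sharpness": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],  # photographic feature
    "subject_age": [30, 50, 20, 50, 20, 40],          # demographic feature
}

correlations = feature_correlations(quality, features)
flagged = flag_features(correlations, watched=["subject_age"])
```

In this toy setup the photographic feature tracks quality closely while the demographic feature does not, so nothing is flagged. A real audit would need many more samples and significance testing.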

Unbiased Machine Vision
How do we define bias? How can we identify and operationalize such bias dimensions?

Representation: learning from unevenly distributed data

Underrepresentative Machine Vision: Culture biases

Representative Machine Vision Some solutions

Representative Machine Vision: Multicultural Machine Vision Tools
Reflecting visual definitions and preferences of people around the world

Representative Machine Vision
We have sentiment detectors for images… reflecting the sentiment perception of small groups of (western) people.
Can we make image sentiment classifiers multicultural?
Brendan Jou, Tao Chen, Nikolaos Pappas, Miriam Redi, Mercan Topkara, Shih-Fu Chang. Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology. ACM Multimedia 2015, Brisbane, Australia.
Data, code, demo: mvso.columbia.edu

Representative Machine Vision: Multilingual Visual Sentiment Ontology
7.6M images, 12 languages, 16K language-specific ANPs
● Emotion keywords [Plutchik 1980] in 12 languages
● Flickr crawling
● Adjective-noun pair (ANP) discovery: healthy breakfast, health coffee, ...
● Sentiment annotations through crowdsourcing

Representative Machine Vision: Multilingual Visual Sentiment Ontology
● Cultural insights based on semantically related concepts
● Each cluster reveals
    ○ Wording variation
    ○ Sentiment variation
    ○ Visual content variation

Representative Machine Vision: Multilingual Visual Sentiment Ontology
Cross-Lingual Sentiment Prediction
Language-specific sentiment predictors: ES, FR, ZH, DE, EN, IT
Cross-lingual sentiment prediction

Representative Machine Vision: Multilingual Visual Sentiment Ontology
Cross-Lingual Sentiment Prediction
Comparing different models to understand similarities and differences between language communities.
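
One way to picture the cross-lingual comparison is a matrix in which a predictor trained on one language is scored on every language. The sketch below is only an illustration under strong assumptions: it swaps in a toy nearest-centroid classifier on made-up two-dimensional features, not the models from the MVSO paper, and the toy labels are deliberately exaggerated so that the two languages disagree.

```python
def train_centroids(samples):
    """Fit a nearest-centroid model. samples: list of (feature_vector, label)."""
    sums, counts = {}, {}
    for vec, label in samples:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, value in enumerate(vec):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest to vec (squared distance)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the given label."""
    hits = sum(predict(centroids, vec) == label for vec, label in samples)
    return hits / len(samples)


def cross_lingual_matrix(data_by_language):
    """matrix[src][tgt] = accuracy of the src-language model on tgt data.

    For brevity the diagonal is scored on the training samples themselves;
    a real comparison would use held-out data.
    """
    models = {lang: train_centroids(s) for lang, s in data_by_language.items()}
    return {
        src: {tgt: accuracy(models[src], samples)
              for tgt, samples in data_by_language.items()}
        for src in models
    }


# Made-up features: the same visual cue is labelled in opposite ways.
toy = {
    "EN": [([0.9, 0.1], "positive"), ([0.8, 0.2], "positive"),
           ([0.1, 0.9], "negative"), ([0.2, 0.8], "negative")],
    "ZH": [([0.1, 0.9], "positive"), ([0.2, 0.8], "positive"),
           ([0.9, 0.1], "negative"), ([0.8, 0.2], "negative")],
}
matrix = cross_lingual_matrix(toy)
```

Off-diagonal cells that fall well below the diagonal suggest that two language communities attach different sentiment to similar visual content.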

Representative Machine Vision
What is really “representative”? What is the tradeoff between complexity and representativeness?

Empowerment: the algorithm in the human loop
Photo by Franck V. on Unsplash

Empowering Machine Vision
Finding representative images for human knowledge in a collaborative environment.

Empowering Machine Vision Wikidata Visual Thinking Wikidata is an international and thus multilingual project. While English is the default interface language, the project is intended to be used by, and useful for, users of every language with MediaWiki internationalization support.

Empowering Machine Vision: Missing images in Wikidata
[Chart panels: People, All, Species]

Empowering Machine Vision
Typical scenario: users willing to add images to Wikidata might need to manually search for the right image, using different tools, among millions of free-licensed images.
Item without P18 - ‘Has image’ → Manual search through millions of free-licensed images → Manual selection and evaluation

Empowering Machine Vision
Solving the problem of visual enrichment in a collaborative fashion, leveraging the wisdom of the community:
Item without P18 - ‘Has image’ → Discovering related images from different open sources → Ranking images according to Relevance and Quality, automatically inferred from community curation → Manual Selection and Evaluation
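
The ranking step in this pipeline might look something like the sketch below. It assumes each candidate already carries a relevance score and a quality score in [0, 1]; the `Candidate` type, the file titles, the linear weighting, and the default weight are hypothetical choices for illustration, not details from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A free-licensed image proposed for an item that lacks P18."""
    title: str
    source: str       # e.g. "linked pages", "Commons search", "Flickr search"
    relevance: float  # assumed to lie in [0, 1]
    quality: float    # assumed to lie in [0, 1]


def rank_candidates(candidates, relevance_weight=0.5, top_k=3):
    """Sort candidates by a weighted sum of relevance and quality.

    The shortlist is meant for a human to review, not for automatic upload.
    """
    def score(candidate):
        return (relevance_weight * candidate.relevance
                + (1.0 - relevance_weight) * candidate.quality)
    return sorted(candidates, key=score, reverse=True)[:top_k]


# Made-up candidates for a hypothetical flower item.
candidates = [
    Candidate("File:Example_flower_closeup.jpg", "linked pages", 0.9, 0.8),
    Candidate("File:Example_flower_blurry.jpg", "Commons search", 0.9, 0.2),
    Candidate("File:Example_garden_party.jpg", "Flickr search", 0.3, 0.9),
]
shortlist = rank_candidates(candidates, top_k=2)
```

Keeping the final choice with the contributor matches the last stage of the pipeline, where selection and evaluation stay manual.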

Empowering Machine Vision Automatically discovering RELEVANT Free-Licensed images from linked pages, Commons search, Flickr search

So much user-generated visual information... How to prioritize it?

Empowering Machine Vision
QUALITY: is the image of high photographic quality? Not all relevant images are actually ‘good’ images.
[Example images: High Quality vs. Lower Quality]
Photos: Jee & Rani Nature Photography on Commons; Vinayaraj on Commons

Modeling the wisdom of the community
To surface high-quality pictures, we train a model using data curated by the Wikimedia community.
High Quality: 160K Quality Commons images
Lower Quality: 160K Random Commons images
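
Assembling a balanced training set of this shape could be sketched as below. The function name, the file titles, and the choice to exclude known quality images from the random pool are assumptions for illustration; the slide only states that both classes have 160K images.

```python
import random


def build_balanced_dataset(quality_titles, all_titles, per_class, seed=0):
    """Return shuffled (title, label) pairs: 1 = community-curated quality
    image, 0 = randomly sampled image.

    Assumption: images already marked as quality are removed from the random
    pool so the two classes do not overlap.
    """
    rng = random.Random(seed)
    quality_set = set(quality_titles)
    positives = rng.sample(sorted(quality_set), per_class)
    pool = sorted(set(all_titles) - quality_set)
    negatives = rng.sample(pool, per_class)
    dataset = [(t, 1) for t in positives] + [(t, 0) for t in negatives]
    rng.shuffle(dataset)
    return dataset


# Toy stand-ins for the curated list and the full image list.
quality_titles = [f"File:Quality_{i}.jpg" for i in range(5)]
all_titles = quality_titles + [f"File:Random_{i}.jpg" for i in range(20)]

dataset = build_balanced_dataset(quality_titles, all_titles, per_class=4)
```

Equal class sizes keep a classifier from scoring well simply by predicting the majority class.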

Empowering Machine Vision
FRAMEWORK: Convolutional Neural Network, Google Inception-v3 [1]
Output: High / Low
Drawing: Aphex34 on Commons
[1] Szegedy, Christian, et al. "Rethinking the inception architecture for computer vision." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.

Empowering Machine Vision Example results

Empowering Machine Vision
Community tools for disseminating image recommendations
https://tools.wmflabs.org/wikidata-game/distributed/#game=49&opt=%7B%22type%22%3A%22flower%22%7D

Representative Machine Vision
Is this really empowering? How do we measure disruptions introduced by algorithms?

Summary
● Unbiased: Interpretable models to detect data stereotypes
● Representative: Culture-specific models that reflect different visual worlds
● Empowering: Helping communities with visual knowledge enrichment

Thank you!

Challenges
● Bias: How to identify bias dimensions?
    ○ How do we evaluate bias with the same communities?
● Representation: How to make really representative, universal models? What does that mean?
    ○ Representation vs quality; representative image search results
● Empowerment: How to use machine learning technologies in harmony with democratic, collaborative processes?
● More! Are we missing something?
    ○ Privacy of people in pictures
    ○ Data re-use, what are people doing with images?
