Q&A for Wikidata: CS294S/W Project Pitch (Silei Xu)

Sep 19, 2023

Q&A for Wikidata CS294S/W Project Pitch Silei Xu
Wikidata.org: a large open-domain knowledge base with 90 million items and 8K properties
Q&A on Wikidata
Dataset | Size | Publisher | SOTA | Dataset Quality
CSQA | 1.6 million | AAAI 2018 | 0.71 (F1) | Train & evaluate on synthetic data
LC-QuAD 2.0 | 30K | ISWC 2019 | - | Train & evaluate on paraphrase data
KQA Pro | 117K | arXiv 2020 | 35% | Train & evaluate on paraphrase data
Schema2QA | 470K per domain | CIKM 2020 | 70% | Train on synthetic + paraphrase data, evaluate on real questions
Current Status
● Homework: build a Q&A agent for one domain in Wikidata
● Can we extend this to a multi-domain Q&A agent over the entire Wikidata?
  ○ Extract useful information to generate the manifest and parameter values needed for data synthesis
  ○ Generate synthetic dataset for all domains
  ○ Avoid conflicts
Challenges
● Scalability
  ○ More than 80GB of data
  ○ Extract useful information to generate the manifest and parameter values needed for data synthesis
  ○ Generate synthetic dataset for all domains
  ○ Avoid conflicts
● Representation
  ○ ThingTalk: qualifiers, joins
● Compositionality
  ○ It is impossible to train on all possible combinations; we need to generalize to unseen programs
  ○ Can we leverage other information such as types?
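As a rough illustration of the scalability point, the sketch below streams a dump line by line instead of loading it into memory, and counts which properties are used in each domain; such counts could feed a manifest. It assumes the one-entity-per-line layout of the Wikidata JSON dumps and treats the value of P31 (instance of) as the "domain". The function names and the sample entity IDs are placeholders, not part of the existing tooling.

```python
import json
from collections import Counter, defaultdict


def iter_entities(lines):
    """Yield one entity per line, skipping the enclosing '[' and ']' lines."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def domains_of(entity):
    """Return the item IDs this entity is an instance of (property P31)."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") == "value":
            ids.append(snak["datavalue"]["value"]["id"])
    return ids


def property_usage_by_domain(lines):
    """Count, for every domain, how many entities use each property."""
    usage = defaultdict(Counter)
    for entity in iter_entities(lines):
        props = [p for p in entity.get("claims", {}) if p != "P31"]
        for domain in domains_of(entity):
            usage[domain].update(props)
    return usage


def _item_claim(item_id):
    # Minimal claim shape; real claims carry more fields.
    return {"mainsnak": {"snaktype": "value",
                         "datavalue": {"type": "wikibase-entityid",
                                       "value": {"id": item_id}}}}


def _entity(entity_id, claims):
    # claims maps a property ID to a list of item IDs.
    return {"id": entity_id,
            "claims": {p: [_item_claim(v) for v in vals]
                       for p, vals in claims.items()}}


# Tiny stand-in for the dump; the entity IDs are placeholders.
SAMPLE_DUMP = [
    "[",
    json.dumps(_entity("Q1001", {"P31": ["Q5"], "P19": ["Q2001"], "P106": ["Q3001"]})) + ",",
    json.dumps(_entity("Q1002", {"P31": ["Q5"], "P19": ["Q2001"]})) + ",",
    json.dumps(_entity("Q2001", {"P31": ["Q515"], "P17": ["Q4001"]})),
    "]",
]

usage = property_usage_by_domain(SAMPLE_DUMP)
for domain, counts in usage.items():
    print(domain, counts.most_common())
```

On a real compressed dump, the same function could be fed a file handle opened with `gzip.open(path, "rt", encoding="utf-8")`, so memory use stays bounded by the size of the counters rather than the size of the dump.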
Roadmap
1. Download the Wikidata dump and extract manifest (1~2 weeks)
2. Build a baseline semantic parser with current infrastructure (1~2 weeks)
3. Find out where it fails
4. Improve the quality of representation (manifest, ThingTalk) & synthetic data (3~4 weeks)
5. Beat the benchmarks and profit!
Auto-IoT: Semantic Parser for IoTs, CS294S/W Project Pitch, Silei Xu
Recap: AutoQA
● Automatically generate Q&A agents from schema
  ○ Learn how to ask questions using pre-trained language models
  ○ Synthesize large training set with 800 templates
Auto-IoT
Automatically generate virtual assistants to control IoTs from IoT function signatures
IoT function signature: action set_power(in req power: Enum(on,off))
Example commands: "Turn on/off the light", "Switch on/off the light", "Lights up!", "Lights out!", ...
We have function signatures for 20+ IoT devices in Thingpedia
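The idea of going from a function signature to training commands can be sketched as follows. This is only a minimal illustration under stated assumptions: the three templates are hypothetical, the classes `EnumParam` and `ActionSignature` are invented for the example, and the target program string is a simplified placeholder rather than real ThingTalk syntax.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class EnumParam:
    name: str
    values: tuple


@dataclass(frozen=True)
class ActionSignature:
    name: str
    param: EnumParam


# Hypothetical domain-specific templates for on/off style actions.
TEMPLATES = (
    "turn {value} the {device}",
    "switch {value} the {device}",
    "turn the {device} {value}",
)


def synthesize(signature, device_names, templates=TEMPLATES):
    """Pair every template/device/value combination with a target program."""
    examples = []
    for template, device, value in product(
        templates, device_names, signature.param.values
    ):
        utterance = template.format(value=value, device=device)
        # Placeholder program notation, not actual ThingTalk.
        program = f"{signature.name}({signature.param.name}={value})"
        examples.append((utterance, program))
    return examples


set_power = ActionSignature("set_power", EnumParam("power", ("on", "off")))
examples = synthesize(set_power, ["light"])
for utterance, program in examples:
    print(f"{utterance!r} -> {program}")
```

Idiomatic commands such as "Lights up!" do not fit a value-slot template like these, which is one reason learned paraphrases would be needed on top of fixed templates.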
Difference between Q&A and VA commands
● Generic verb phrases vs domain-specific verb phrases
  ○ Most Q&A tables can be queried with generic verb phrases: “search”, “find”, “show”, “get”, etc.
  ○ IoTs have different verb phrases: “turn on/off”, “lower the temperature”, “open the garage door”, “change the color to blue”, etc.
● Personalization
  ○ In Q&A, everyone queries the same database
  ○ For IoT devices, people may have different sets of devices, and may name them differently.
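To make the personalization point concrete, here is a small sketch of a per-user device registry: the same spoken name resolves to different devices for different users. The class `DeviceRegistry`, the device identifiers, and the fuzzy-match cutoff are all assumptions made for illustration; they are not taken from Almond or Home Assistant.

```python
import difflib


class DeviceRegistry:
    """Maps the names one user gave their devices to device identifiers."""

    def __init__(self):
        self._devices = {}

    def add(self, name, device_id):
        self._devices[name.strip().lower()] = device_id

    def resolve(self, mention):
        """Return the device ID for a mention, or None if nothing is close."""
        key = mention.strip().lower()
        if key in self._devices:
            return self._devices[key]
        # Fall back to a fuzzy match for small variations in wording.
        close = difflib.get_close_matches(
            key, list(self._devices), n=1, cutoff=0.8
        )
        return self._devices[close[0]] if close else None


# Two users with different devices behind the same name (IDs are made up).
alice = DeviceRegistry()
alice.add("Bedroom lamp", "light.alice_1")
alice.add("Kitchen light", "light.alice_2")

bob = DeviceRegistry()
bob.add("bedroom lamp", "light.bob_7")

print(alice.resolve("bedroom lamp"))
print(bob.resolve("bedroom lamp"))
print(alice.resolve("garage door"))
```

A Q&A agent has no equivalent of this step, since every user queries the same database.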
Roadmap
1. Learn available commands for IoTs and analyze their sentence structure (~1 week)
2. Implement an algorithm for Auto-IoT similar to the one in AutoQA (~2 weeks)
3. Find out where it fails
4. Improve the algorithm & investigate new methodologies (3~4 weeks)
5. Get integrated with Almond + Home Assistant
6. Profit!

Recommend

Wikidata and Querying Wikidata with SPARQL Semantic Technologies 5.1 1 What are the ten

Wikidata and Querying Wikidata with SPARQL Semantic Technologies 5.1 1 What are the ten largest cities with a female mayor? Semantic Technologies 5.1 2 Where are people born who travel to space? (colour-coded by gender) Semantic

621 views • 13 slides

Lexemes in Wikidata Lexicographical data for everyone Lydia Pintscher @nightrose Wikidata in

Lexemes in Wikidata Lexicographical data for everyone Lydia Pintscher @nightrose Wikidata in numbers Editors (5 or more edits a month) > 12.500 Items 74.8 Million Properties 7164 Statements 931 Million Statements linking to other

465 views • 21 slides

Generating OpenMath Content Dictionaries from Wikidata Moritz Schubotz Information Science

Generating OpenMath Content Dictionaries from Wikidata Moritz Schubotz Information Science Group University of Konstanz www.isg.uni.kn Overview 1. Introduction 2. The MathML Benchmark MathMLBen 3. A Wikidata Content Dictionary 4. Wikidata

666 views • 27 slides

Multilinguality in Wikidata Lucie-Aimée Kaffee kaffee@soton.ac.uk About Me PhD Student WAIS,

Multilinguality in Wikidata Lucie-Aimée Kaffee kaffee@soton.ac.uk About Me PhD Student WAIS, University of Southampton Previously worked as a Software Developer at Wikimedia Deutschland, in the Wikidata team Interest in (under-resourced)

626 views • 61 slides

Wikidata in Wikipedia [[User:Mike Peel]] Wikimania 2016 With thanks to Lydia Pintscher, Rex

Wikidata in Wikipedia [[User:Mike Peel]] Wikimania 2016 With thanks to Lydia Pintscher, Rex Schneider and Liam Wyatt for feedback Why blind reviewing isn't always a good idea... I'm a Wikidata user and Wikipedia editor. I've done cool

228 views • 20 slides

Classification of Knowledge Organization Systems with Wikidata 10.5281/zenodo.61767 Jakob Voß

Classification of Knowledge Organization Systems with Wikidata 10.5281/zenodo.61767 Jakob Voß Verbundzentrale des GBV (VZG), Göttingen, Germany 15th European NKOS Workshop, Hannover, 2016-09-09 KOS classification with Wikidata,

406 views • 38 slides

Introducing Wikidata to the Linked Data Web Fredo Erxleben, Michael Günther, Markus Krötzsch

www.tu-dresden.de Introducing Wikidata to the Linked Data Web Fredo Erxleben, Michael Günther, Markus Krötzsch, Julian Mendez and Denny Vrandečić ISWC, 21st October 2014 About Wikidata Launched on 30th October 2012 15,817,444

224 views • 20 slides

Wikidata the free and open knowledge base Wikimedia DC - Sunlight Foundation Hackathon - April

Wikidata the free and open knowledge base Wikimedia DC - Sunlight Foundation Hackathon - April 2014 Katie Filbert - @filbertkm https://github.com/filbertkm/slides CAN HAZ DATA? Credits: Sasan Geranmehr (CC-BY 3.0) What is Wikidata?

604 views • 45 slides

CICM 2016, OpenMath workshop Displaying formulae in Wikimedia's Central Data Storage Wikidata

CICM 2016, OpenMath workshop Displaying formulae in Wikimedia's Central Data Storage Wikidata Moritz Schubotz with material from Katie Filbert 31/07/2016 www.formulasearchengine.com 1 What is Wikidata? repository of the world's

437 views • 21 slides

Vandalism Detection in Wikidata Stefan Heindorf 1 , Martin Potthast 2 , Benno Stein 2 , Gregor

Vandalism Detection in Wikidata Stefan Heindorf 1 , Martin Potthast 2 , Benno Stein 2 , Gregor Engels 1 CIKM 2016 October 25, 2016 1 2 Motivation Vandalism Detection in Wikidata Stefan Heindorf 2 Motivation Vandalism Detection in Wikidata

944 views • 67 slides

State of identifiers in Wikidata WikidataCon 2017, Berlin Jakob Voß Verbundzentrale des GBV

State of identifiers in Wikidata (2017-10-28) WikidataCon 2017, Berlin Jakob Voß Verbundzentrale des GBV (VZG) 2017-10-28 entity ids

360 views • 12 slides

Wikidata as authority linking hub Joachim Neubert (ZBW) Jakob Voß (VZG) Introduction

Wikidata as authority linking hub Joachim Neubert (ZBW) Jakob Voß (VZG) Introduction Authority files Consistently refer to entities Via identifier (things, not strings) GND, MeSH, STW, ISIL, RePEc-Authors Linking hubs Connect

732 views • 30 slides

Managing and Consuming Completeness Information for Wikidata Using COOL-WD KRDB Research Centre,

Managing and Consuming Completeness Information for Wikidata Using COOL-WD KRDB Research Centre, Free University of Bozen-Bolzano Radityo Eko Prasojo , Fariz Darari , Simon Razniewski, Werner Nutt COLD 2016 @ Kobe, Japan October 18, 2016

438 views • 19 slides

Wikidata-based multilingual library search Elena Gretillat Laurel Zuckerman Jacqueline

Wikidata-based multilingual library search Elena Gretillat Laurel Zuckerman Jacqueline Martinelli Nicolas Prongué Lionel Walter The situation SWITZERLAND Who are they? Zentralbibliothek Zürich Schweizerischer Bundesrat

380 views • 27 slides

Discovery with Linked Open Data: Leveraging Wikidata for Context and Exploration Lucas Mak

Discovery with Linked Open Data: Leveraging Wikidata for Context and Exploration Lucas Mak Devin Higgins (me) @devinhhi Michigan State University Libraries Digital Repository: https://d.lib.msu.edu Main Idea Use linked data to provide

137 views • 13 slides

Populating Narratives Using Wikidata Events: An Initial Experiment Daniele Metilli, Valentina

Populating Narratives Using Wikidata Events: An Initial Experiment Daniele Metilli, Valentina Bartalesi, Carlo Meghini, Nicola Aloia IRCDL 2019, Pisa Narratives in Digital Libraries Our research aims to introduce narratives into Digital

611 views • 14 slides

OD2WD: From Open Data to Wikidata through Patterns Muhammad Faiz, Gibran M.F. Wisesa, Adila

OD2WD: From Open Data to Wikidata through Patterns Muhammad Faiz, Gibran M.F. Wisesa, Adila Krisnadhi, and Fariz Darari Faculty of Computer Science, Universitas Indonesia, Depok, Indonesia Outline Motivation The OD2WD system

275 views • 25 slides

WDPlus: Leveraging Wikidata to Link and Extend Tabular Data Daniel Garijo , Pedro Szekely

WDPlus: Leveraging Wikidata to Link and Extend Tabular Data Daniel Garijo , Pedro Szekely Information Sciences Institute and Department of Computer Science @dgarijov dgarijo@isi.edu Abundance of data sources in the Web Users of data face

601 views • 13 slides

Querying Linked Data with SPARQL and the Wikidata Query Service Lucas Werkmeister 2019-12-27

Querying Linked Data with SPARQL and the Wikidata Query Service Lucas Werkmeister 2019-12-27 Lucas Werkmeister https://tinyurl.com/36c3-wdqs 1/20 An example graph happens in is next to Esszimmer is next to Küche is part of happens in

561 views • 41 slides

NECKAr: A Named Entity Classifier for Wikidata Johanna Geiß, Andreas Spitz, Michael Gertz

NECKAr: A Named Entity Classifier for Wikidata Johanna Geiß, Andreas Spitz, Michael Gertz Heidelberg University, Institute of Computer Science Database Systems Research Group {geiss,spitz,gertz}@informatik.uni-heidelberg.de GSCL Berlin,

515 views • 27 slides

Documenting and preserving programming languages and software in Wikidata John Samuel, Katherine

Documenting and preserving programming languages and software in Wikidata John Samuel, Katherine Thornton, Kenneth Seals-Nutt CPE Lyon, EaaSI SWIB 2018, Bonn, 27th November, 2018 Digital Preservation | John Samuel, Katherine Thornton

983 views • 59 slides

Open your structured data with Wikibase install your own instance of the technology behind

Open your structured data with Wikibase install your own instance of the technology behind Wikidata Jens Ohlig <jens.ohlig@wikimedia.de> Wikidata: the basics A knowledge base Part of the Wikimedia projects Structured

358 views • 33 slides

Pathways for Discovery of Free Software Katherine Thornton, Morane Gruenpeter Wikidata for Digital

Pathways for Discovery of Free Software Katherine Thornton, Morane Gruenpeter Wikidata for Digital Preservation katherine.thornton@yale.edu, morane@softwareheritage.org 25 March 2018 Katherine Thornton, Morane Gruenpeter Pathways for Discovery

768 views • 50 slides

OpenStreetMap and Wikimedia: A quick overview State of the Map 2018 Eugene Alvin Villar

OpenStreetMap and Wikimedia: A quick overview State of the Map 2018 Eugene Alvin Villar [[User:seav]] OpenStreetMap is like Wikipedia for maps OpenStreetMap is like Wikidata for geographical data OpenStreetMap has nodes, ways,

833 views • 30 slides