
Mid-term evaluation defense (Soutenance d’évaluation à mi-parcours): Uncertainty over Structured and Intensional Data



  1. Mid-term evaluation defense (Soutenance d’évaluation à mi-parcours): Uncertainty over Structured and Intensional Data
     Antoine Amarilli, Télécom ParisTech; Institut Mines–Télécom; CNRS LTCI
     December 4th, 2014
     Outline: Research Topic; Tractable Probabilistic Data; Open-World Query Answering; Crowd Data Mining; Other Topics; Conclusion

  2. Background
     Lots of raw information on the Web
       → Extract structure
       → Integrate various sources
       → Manage possible errors
     Leverage it to answer complex queries
       → Where can I get a pizza?
       → Find an affordable flat near Télécom with ≥ 20 m²?

  3. Intensionality
     We cannot collect all information
     Choose relevant accesses dynamically
     Need to access remote data sparingly
       → Storage space
       → Bandwidth
       → Access restrictions
       → Crowdsourcing
       → Expensive processing
       → Web crawling
       → Deep Web
       → Rule consequences
       → Web APIs

  4. Structure
     Need to leverage existing structure
     Structure can be heterogeneous
       → Avoid focusing only on one framework
       → Web graph
       → XML/JSON
       → Relational DBs
       → RDF triples
       → Views
       → Parse trees

  5. Uncertainty
     Data is imprecise
     Data is wrong
     Represent priors on remote data
     Processing induces uncertainty
       → Fuzzy rules
       → Crowdsourcing
       → Data integration
       → NLP
       → Annotations
       → Information extraction

  6. Use cases
     Extracting structured facts from an open set of news sources
       → Start with an initial knowledge about the world
       → Locate promising articles
       → Run expensive processing on the articles
       → Uncertainty when accessing, disambiguating
       → Use crowdsourcing to validate the facts
       → Using logical rules to constrain them
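
The use-case steps on slide 6 can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions, not the method presented in the talk: the `Fact` type, the function names, the toy "X acquired Y" extractor, the fixed 0.6 extractor confidence, the vote-averaging step, and the single "one buyer per company" rule are all hypothetical stand-ins chosen to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A structured fact (hypothetical representation)."""
    subject: str
    relation: str
    obj: str


def locate_promising(articles, known_entities, budget):
    """Rank articles by how many known entities they mention and keep
    the top `budget`, since we cannot afford to process everything."""
    def score(item):
        _, text = item
        return sum(1 for entity in known_entities if entity in text)
    ranked = sorted(articles.items(), key=score, reverse=True)
    return [name for name, _ in ranked[:budget]]


def extract_facts(text):
    """Toy stand-in for expensive processing: parse 'X acquired Y'.
    Returns {Fact: confidence}; 0.6 is an arbitrary placeholder."""
    facts = {}
    for sentence in text.split("."):
        words = sentence.split()
        if len(words) == 3 and words[1] == "acquired":
            facts[Fact(words[0], "acquired", words[2])] = 0.6
    return facts


def violates_rules(fact, accepted):
    """Illustrative logical rule: a company has at most one buyer."""
    return any(
        f.relation == fact.relation
        and f.obj == fact.obj
        and f.subject != fact.subject
        for f in accepted
    )


def run_pipeline(articles, known_entities, budget, crowd_votes, threshold=0.5):
    """Locate, extract, validate with crowd votes, then apply the rule."""
    accepted = set()
    for name in locate_promising(articles, known_entities, budget):
        for fact, conf in extract_facts(articles[name]).items():
            votes = crowd_votes.get(fact)
            if votes:
                # Assumed combination: average extractor and crowd scores.
                conf = (conf + sum(votes) / len(votes)) / 2
            if conf >= threshold and not violates_rules(fact, accepted):
                accepted.add(fact)
    return accepted


# Made-up example data, for illustration only.
articles = {
    "a1": "Acme acquired Bolt. Weather was nice",
    "a2": "Crate acquired Bolt",
    "a3": "Nothing relevant here",
}
known_entities = {"Acme", "Bolt", "Crate"}
crowd_votes = {
    Fact("Acme", "acquired", "Bolt"): [1, 1, 0],
    Fact("Crate", "acquired", "Bolt"): [0, 0, 1],
}
result = run_pipeline(articles, known_entities, budget=2, crowd_votes=crowd_votes)
```

With this made-up data, only the fact from article `a1` is kept: the competing fact from `a2` falls below the threshold once the crowd votes are averaged in, and it would also conflict with the one-buyer rule. A real system would presumably replace each stand-in with a probabilistic model rather than fixed scores.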
