On the Use of Linked Open Data for Trusting Web Data, by Davide Ceolin


  1. On the Use of Linked Open Data for Trusting Web Data Davide Ceolin and Valentina Maccatrozzo VU University Amsterdam

  2. Outline • Premises • Introduction • A Natural History Case Study • A Cultural Heritage Case Study • Future directions • Recap, bibliography, etc.

  3. Premises • Trust ≈ Reliability. • We make no assumption about the intentions of the data creator. • This presentation reflects on past work (see the bibliography) and outlines future directions.

  4. Introduction • Trust management: subjective logic (Jøsang, 2001). • Extends Boolean and probabilistic logic. • Reasons on “opinions” about propositions, based on evidence. • Accounts for the source and for uncertainty (inversely proportional to the size of the evidence set).

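  5. Subjective logic: basics • Opinion of a source about a proposition: ω^{source}_{proposition} = (b, d, u), i.e. belief, disbelief, uncertainty. • b + d + u = 1 • b ≈ p(proposition) • u is inversely proportional to the size of the evidence set. • Operators: Boolean, discounting, fusion...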
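
The opinion triple on this slide can be derived from evidence counts. The following is a minimal sketch, assuming the usual evidence-to-opinion mapping from Jøsang (2001) with a prior weight of 2 and a default base rate of 0.5. The names `Opinion` and `opinion_from_evidence` are illustrative and do not come from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    belief: float
    disbelief: float
    uncertainty: float
    base_rate: float = 0.5  # assumed prior probability of the proposition

    def expected_value(self) -> float:
        # Probability expectation: belief plus the share of the
        # uncertainty that the base rate assigns to the proposition.
        return self.belief + self.base_rate * self.uncertainty


def opinion_from_evidence(positive: int, negative: int) -> Opinion:
    # Assumed prior weight of 2: with no evidence the uncertainty is 1,
    # and it shrinks as the evidence set grows.
    total = positive + negative + 2
    return Opinion(positive / total, negative / total, 2 / total)


if __name__ == "__main__":
    opinion = opinion_from_evidence(positive=8, negative=0)
    print(opinion)
    print(opinion.expected_value())
```

With no evidence the sketch returns the fully uncertain opinion (0, 0, 1); each additional observation moves mass from u to b or d, which matches the slide's statement that u is inversely proportional to the evidence set.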

  6. Trusting Web data using subjective logic • We adopt supervised learning algorithms (when a training set is available). • Trust is subjective. • We look for different “opinions” about the data. Subjective logic allows us to handle them. • First we estimate the trustworthiness of the data, then we select the “best” data (based, e.g., on author reputation).
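
Handling several opinions about the same data typically relies on two subjective logic operators: discounting and fusion. The sketch below assumes the operator definitions from Jøsang (2001), where cumulative fusion is called consensus. The `Opinion` tuple and the function names are illustrative, and the example numbers are made up.

```python
from typing import NamedTuple


class Opinion(NamedTuple):
    belief: float
    disbelief: float
    uncertainty: float


def discount(trust_in_source: Opinion, source_opinion: Opinion) -> Opinion:
    """Weaken a source's opinion according to how much we trust the source."""
    t, o = trust_in_source, source_opinion
    return Opinion(
        t.belief * o.belief,
        t.belief * o.disbelief,
        t.disbelief + t.uncertainty + t.belief * o.uncertainty,
    )


def fuse(a: Opinion, b: Opinion) -> Opinion:
    """Combine two opinions that rest on independent evidence."""
    k = a.uncertainty + b.uncertainty - a.uncertainty * b.uncertainty
    if k == 0:
        # Both opinions have zero uncertainty; this sketch does not cover that case.
        raise ValueError("cannot fuse two opinions with zero uncertainty")
    return Opinion(
        (a.belief * b.uncertainty + b.belief * a.uncertainty) / k,
        (a.disbelief * b.uncertainty + b.disbelief * a.uncertainty) / k,
        (a.uncertainty * b.uncertainty) / k,
    )


if __name__ == "__main__":
    # Hypothetical opinions of two sources about the same annotation.
    first = Opinion(0.6, 0.2, 0.2)
    second = Opinion(0.3, 0.2, 0.5)
    print(fuse(first, second))
    # Hypothetical trust we place in the second source.
    print(discount(Opinion(0.8, 0.1, 0.1), second))
```

Both operators keep b + d + u = 1. Fusion lowers the uncertainty because the two evidence sets are pooled, whereas discounting raises it because the opinion is only as good as our trust in its source.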

  7. Using LOD to assist evidential reasoning • LOD provide lots of useful data. • More evidence. • Subjective logic’s distributions ≈ the distributions of (at least some) LOD datasets (Ceolin et al., 2011).

  8. Museums... Photo: flickr.com/clumsyjim

  9. Museums... ...have a problem. Photo: flickr.com/clumsyjim Photo: flickr.com/grrrl

  10. So they recruit some help... Photo: flickr.com/anirudhkoul

  11. Trusting Museum Annotations • Museums manage large collections. • Several museums crowdsource annotations. • The quality and accuracy of annotations are crucial for their business. • Can they trust crowdsourced annotations?

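  12. A Natural History Case Study • [Diagram: specimens, the users who annotated them, and the annotation values: specimen 1, user1, aves xx; specimen 2, user2, aves xyz; specimen 3, user1, aves xz1; specimen 4, user2, aves yy; specimen 5, user3, aves zz. Some annotations carry a ✓ or ✗ mark.]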

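  14. A Natural History Case Study • [Diagram: specimens, users, annotation values, and tax authors: specimen 1, user1, aves xx, tax author1; specimen 2, user2, aves xyz, tax author2; specimen 3, user1, aves xz1, tax author1; specimen 4, user2, aves yy, tax author1; specimen 5, user3, aves zz, tax author2. Some annotations carry a ✓ or ✗ mark.]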

  15. A Natural History Case Study • [Diagram: same specimens, users, annotation values, and tax authors as on slide 14.] • Result: accuracy increased from 53% to 82% on a museum dataset (Ceolin et al., 2010).
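
One way to read the diagram is as reputation building: evaluated annotations are evidence about the user who made them, and that evidence decides whether a new, unevaluated annotation is accepted. The sketch below is not the authors' model (which also draws on additional evidence such as tax authors). It considers user reputation only, and the data, threshold, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical evaluated annotations: (user, accepted by the museum?).
EVALUATED = [
    ("user1", True),
    ("user1", True),
    ("user1", True),
    ("user2", True),
    ("user2", False),
    ("user3", False),
]


def build_reputations(evaluated):
    """Turn per-user accept/reject counts into (b, d, u) opinions."""
    counts = defaultdict(lambda: [0, 0])  # user -> [accepted, rejected]
    for user, accepted in evaluated:
        counts[user][0 if accepted else 1] += 1
    reputations = {}
    for user, (positive, negative) in counts.items():
        total = positive + negative + 2  # assumed prior weight of 2
        reputations[user] = (positive / total, negative / total, 2 / total)
    return reputations


def accept(user, reputations, threshold=0.7):
    """Accept a new annotation if the user's expected reliability is high enough."""
    # Unknown users get the fully uncertain opinion (0, 0, 1).
    belief, _disbelief, uncertainty = reputations.get(user, (0.0, 0.0, 1.0))
    expected = belief + 0.5 * uncertainty  # assumed base rate of 0.5
    return expected >= threshold


if __name__ == "__main__":
    reputations = build_reputations(EVALUATED)
    for user in ("user1", "user2", "user3", "user4"):
        print(user, accept(user, reputations))
```

The threshold is a policy choice: a museum that prefers precision over coverage would raise it, at the cost of sending more annotations to manual review.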

  16. Another Case Study • Semantic similarity for weighing evidence. • [Diagram labels: Training set, Expertise, New annotation, Semantic similarity; example terms: Tulip, Flower, Rose, Red, Pink, Purple.] • Up to 84% accuracy, 88% precision, and 96% recall on two museum datasets (Ceolin et al., 2013a).
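
The idea of weighing evidence by semantic similarity can be sketched as follows: each evaluated annotation in the training set counts as evidence for a new annotation in proportion to how similar the two terms are. This is an illustration under stated assumptions, not the method of Ceolin et al. (2013a). The similarity scores, the training data, and all names are hypothetical; a real system would compute similarity from a lexical resource or from LOD.

```python
# Hypothetical similarity scores in [0, 1] between annotation terms.
SIMILARITY = {
    ("flower", "tulip"): 0.9,
    ("flower", "rose"): 0.9,
    ("flower", "red"): 0.3,
    ("flower", "pink"): 0.3,
    ("flower", "purple"): 0.2,
}

# Hypothetical training set: (annotation term, accepted by the museum?).
TRAINING = [
    ("tulip", True),
    ("rose", True),
    ("pink", True),
    ("red", False),
    ("purple", False),
]


def similarity(first, second):
    """Symmetric lookup; unknown pairs are treated as unrelated."""
    if first == second:
        return 1.0
    return SIMILARITY.get((first, second), SIMILARITY.get((second, first), 0.0))


def similarity_weighted_opinion(new_term, training):
    """Build a (b, d, u) opinion from similarity-weighted evidence."""
    positive = sum(similarity(new_term, term) for term, ok in training if ok)
    negative = sum(similarity(new_term, term) for term, ok in training if not ok)
    total = positive + negative + 2  # assumed prior weight of 2
    return positive / total, negative / total, 2 / total


if __name__ == "__main__":
    print(similarity_weighted_opinion("flower", TRAINING))
```

Because dissimilar terms contribute little evidence, a new annotation that resembles nothing in the training set keeps a high uncertainty instead of inheriting an unrelated reputation.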

  17. Future work • We used similar methods (plus other statistical techniques) for analyzing the reliability of UK Police Open Data (Ceolin et al., 2013b). • We plan to extend them with LOD, e.g., for: geodisambiguation; crime type hierarchies.

  18. Recap • LOD + evidential reasoning (subjective logic) is a powerful combination for trust (reliability) estimation: enrichment; weighing. • The more the better, but: evidence quality counts; data needs to be tracked (W3C PROV) and properly managed.

  19. Bibliography • Jøsang, A. A Logic for Uncertain Probabilities. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 9(3), pp. 279-311, 2001. • Ceolin, D., van Hage, W.R., Fokkink, W. A Trust Model to Estimate Quality of Annotations using the Web. In WebSci, Web Science Repository, 2010. • Ceolin, D., van Hage, W.R., Fokkink, W., Schreiber, G. Estimating Uncertainty of Categorical Web Data. In URSW, CEUR-ws.org, 2011. • [2013a] Ceolin, D., Nottamkandath, A., Fokkink, W. Semi-automated Assessment of Annotations Trustworthiness. In PST Conference, IEEE, 2013. • [2013b] Ceolin, D., Moreau, L., O'Hara, K., Schreiber, G., Sackley, A., Fokkink, W., van Hage, W.R., Shadbolt, N. Reliability Analyses of Open Government Data. In URSW, CEUR-ws.org, 2013. • Ceolin, D., Nottamkandath, A., Fokkink, W. Efficient Semi-automated Assessment of Annotations Trustworthiness. In Journal of Trust Management, Springer (accepted, 2014).

  20. Thank you! Any questions? d.ceolin@vu.nl
