
Provenance-Only Integration, Ashish Gehani and Dawood Tariq (SRI)



  1. Provenance-Only Integration
     Ashish Gehani, Dawood Tariq (SRI)
     Provenance-Only Integration – p. 1/13

  2. Integration Challenges
     Metadata variation: abstraction levels, completeness, identifiers, semantics
     Querying requires: record assembly, reconciling syntax, mapping semantics

  3. Related Work
     Data integration: AnHai Doan, Alon Halevy, and Zachary Ives, Principles of data integration, Elsevier, 2012.
     Provenance integration: semantic web (Umuhoza 2012), grid computing (Zhao 2008), system interoperability (Angelino 2011), cross-organization sharing (Allen 2011)

  4. Provenance-Only Integration
     Single underlying activity; multiple views of it; partial overlap in metadata
     Example (the same elements as recorded in the two views):
       Artifact, identical in both views: path:/private/var/log/asl/2014.06.10.U0.G80.asl filename:2014.06.10.U0.G80.asl type:Artifact version:1
       Edge: (type:WasGeneratedBy) in one view, (time:1402396472245 type:WasGeneratedBy) in the other
       Process: uid:0 pidname:syslogd gid:0 ppid:1 pid:21 starttime_simple:Thu May 22 18:24:41 2014 type:Process in one view, pid:21 type:Process user:root in the other
       Edge: (time:1402400047469 type:Used) in one view, (type:Used) in the other
       Artifact, identical in both views: path:/private/etc/syslog.conf filename:syslog.conf type:Artifact version:0
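
A minimal sketch of the partial overlap in slide 4, using the two syslogd process records from the slide. The names `view_a`, `view_b`, and `shared_annotations` are hypothetical, and the assignment of fields to each view is an assumption read off the flattened slide text. It is not taken from the authors' code.

```python
# Two records of the same syslogd process, as they appear on slide 4.
# Which fields belong to which view is an assumption.
view_a = {
    "uid": "0", "pidname": "syslogd", "gid": "0", "ppid": "1", "pid": "21",
    "starttime_simple": "Thu May 22 18:24:41 2014", "type": "Process",
}
view_b = {"pid": "21", "type": "Process", "user": "root"}


def shared_annotations(a, b):
    """Return the key:value pairs that appear identically in both records."""
    return {k: v for k, v in a.items() if b.get(k) == v}


overlap = shared_annotations(view_a, view_b)
print(overlap)
```

Only `pid` and `type` are common to both records; the rest of the metadata is visible from one vantage point only, which is the situation the slide describes.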

  5. Speech Processing Hot Spots

  6. Basic Provenance-Only Integration
     Provenance from two vantage points; need to integrate the two
     Approach: define matching threshold τ; merge vertex pair if τ-similar; merge edge pair if τ-similar
     Cost from conflating owners
     Goal: minimize τ, keep cost < tolerance Υ
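
The optimization on slide 6 can be sketched as a search over τ. This is an illustration under stated assumptions, not the authors' algorithm: the slide does not define "τ-similar" or the cost function, so here two vertices are taken to be τ-similar when they share at least τ identical annotations, and the cost counts matched pairs whose (hypothetical) `owner` annotation differs. All function names are made up for the example.

```python
from itertools import product


def similarity(a, b):
    """Number of identical key:value annotations (assumed similarity measure)."""
    return sum(1 for k, v in a.items() if b.get(k) == v)


def matched_pairs(view_a, view_b, tau):
    """All cross-view vertex pairs that are tau-similar under the assumption above."""
    return [(a, b) for a, b in product(view_a, view_b) if similarity(a, b) >= tau]


def conflation_cost(pairs, owner_key="owner"):
    """Assumed cost: number of matched pairs whose owners differ."""
    return sum(1 for a, b in pairs if a.get(owner_key) != b.get(owner_key))


def minimal_threshold(view_a, view_b, tolerance, max_tau):
    """Smallest tau whose matches keep the cost strictly below the tolerance."""
    for tau in range(max_tau + 1):
        pairs = matched_pairs(view_a, view_b, tau)
        if conflation_cost(pairs) < tolerance:
            return tau, pairs
    return None


view_a = [{"type": "Process", "pid": "21", "owner": "root"},
          {"type": "Process", "pid": "40", "owner": "alice"}]
view_b = [{"type": "Process", "pid": "21", "owner": "root"},
          {"type": "Process", "pid": "77", "owner": "bob"}]

result = minimal_threshold(view_a, view_b, tolerance=1, max_tau=3)
print(result)
```

Raising τ can only remove matches, so the cost never increases with τ and the first τ that satisfies the tolerance is the minimum. In this toy input, τ = 0 and τ = 1 match every pair and conflate three owners, while τ = 2 keeps only the pid-21 pair at zero cost.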

  7. Android Provenance
     Security analysis
     System-wide monitoring
     Resource-constrained
     Disrupts power management
     Blinded by garbage collection
     Multiple abstraction levels: kernel interface, inter-application (Binder)
     Provenance-only integration

  8. Android by Alvaro Fuentes Vasquez via Wikimedia Commons (CC-BY-SA-3.0-2.5-2.0-1.0)

  9. Fast Integration
     Integrate all τ-similar elements
     Don’t have to find matching pairs
     Avoids subgraph isomorphism problem
     Separate vertex, edge matching thresholds
     Thresholds are input now
     Cost is per match now
     Approach:
       Merge τ_v-similar vertices, if cost < Υ
       Merge τ_e-similar edges, if cost < Υ
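
A rough sketch of the vertex half of slide 9, assuming the pooled vertices from all views are grouped with a union-find structure. The similarity measure (count of identical annotations), the per-match cost (count of conflicting annotations), and every name in the code are assumptions for illustration. The slide gives neither definition, and edges would be handled the same way with their own threshold τ_e.

```python
def shared(a, b):
    """Identical key:value annotations (assumed similarity measure)."""
    return sum(1 for k, v in a.items() if b.get(k) == v)


def conflicts(a, b):
    """Keys present in both records with different values (assumed per-match cost)."""
    return sum(1 for k, v in a.items() if k in b and b[k] != v)


def fast_merge(vertices, tau_v, tolerance):
    """Group vertices that are tau_v-similar and whose match cost is below tolerance."""
    parent = list(range(len(vertices)))

    def find(i):
        # Follow parent links to the group representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vertices)):
        for j in range(i + 1, len(vertices)):
            a, b = vertices[i], vertices[j]
            if shared(a, b) >= tau_v and conflicts(a, b) < tolerance:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vertices)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


vertices = [
    {"type": "Process", "pid": "21", "uid": "0", "pidname": "syslogd"},
    {"type": "Process", "pid": "21", "user": "root"},
    {"type": "Artifact", "path": "/private/etc/syslog.conf", "version": "0"},
    {"type": "Artifact", "path": "/private/etc/syslog.conf", "version": "0"},
]
groups = fast_merge(vertices, tau_v=2, tolerance=1)
print(groups)
```

The thresholds are inputs here rather than the result of a search, and each candidate match is accepted or rejected on its own cost, so no cross-view pairing of whole subgraphs is needed.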

  10. False Integration → High Cost
      [Plot: cost (y-axis) against threshold (x-axis)]

  11. Integration as Abstraction
      [Plot: graph size in vertices (y-axis) against threshold (x-axis)]

  12. Fidelity of Attribution
      [Plot: process count in vertices (y-axis) against threshold (x-axis), four series]

  13. Conclusion
      Provenance-only integration
      Basic form as constrained optimization
      Fast version → automated abstraction
      Acknowledgement: TaPP ’14 organizers, reviewers; US NSF Grant IIS-1116414
      URL: http://data-provenance.googlecode.com
      Email: ashish.gehani@sri.com
      Questions?
