
Peer-to-Peer Data Integration with Active XML. Tova Milo, Tel Aviv University



  1. Peer-to-Peer Data Integration with Active XML. Tova Milo, Tel Aviv University

  2. Active XML - Outline: Introduction; Active XML (Active XML documents, Active XML services); Novel issues (Exchanging Active XML data, Querying Active XML data, Distribution and replication, Security and Access control); Active XML Peers (The peer as a client, The peer as a server, Theoretical foundations); Applications; Conclusion

  3. Introduction

  4. Distributed data management in P2P
     Information is everywhere: PC, PDA, cell phones, home appliances, cars…
     [Figure: XML exchanged among Web services, Web sites, databases, and data warehouses]

  5. The golden triangle of distributed data management
     XML
     • a standard for data representation & exchange
     Query languages
     • XPath, XQuery
     Web services
     • standards for distributed computing
     • activation of methods on remote servers

  6. What is Active XML (AXML)? AXML is a declarative language for distributed information management and an infrastructure to support this language, in a peer-to-peer framework.

  7. Active XML

  8. Active XML documents: XML documents with embedded calls to Web services
     Intensional
     • Some of the data is given explicitly
     • Some is given intensionally (i.e. the means to acquire data when needed are given)
     Dynamic
     • If the external sources change, the same document will provide different information
     • Reaction to world changes

  9. Not a new idea in databases, nor on the Web
     Mixing calls and data is an old idea
     • Procedural attributes in relational systems
     • Basis of object-oriented databases
     In the HTML world
     • Sun’s JSP, PHP+MySQL
     Calls to Web services inside XML documents
     • Macromedia FLEX, Apache Jelly, Microsoft XAML
     What is new is the exploitation of the idea…

  10. A sample AXML document

          <?xml version="1.0" ?>
          <newspaper>
            <title>Le Monde</title>
            <date>06/10/2003</date>
            <call svc="Yahoo.GetTemp">
              <city>Paris</city>
            </call>
            <call svc="TimeOut.GetEvents">
              exhibits
            </call>
          </newspaper>

      [Figure: the same document drawn as a tree, with the GetTemp and GetEvents calls as nodes]
      AXML documents may contain calls:
      • to any existing Web services (e-bay.net, google.com…)
      • to any AXML Web services (to be defined)

  11. Materialization

          <?xml version="1.0" ?>
          <newspaper>
            <title>Le Monde</title>
            <date>06/10/2003</date>
            <temp>16°C</temp>
            <call svc="Yahoo.GetTemp">
              <city>Paris</city>
            </call>
            <call svc="TimeOut.GetEvents">
              exhibits
            </call>
          </newspaper>

      [Figure: the document tree after a SOAP call to GetTemp has added the temp node]
      We will see later that:
      • Replacing the call by its result is not the only option
      • Calls are not necessarily RPC-style synchronous invocations
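
A minimal Python sketch of one materialization step on the sample document. It is an illustration under stated assumptions, not the AXML implementation: the `SERVICES` table and `get_temp` are hypothetical local stand-ins for remote Web services, and the result is placed next to the call (which is kept), as in the XML on the slide.

```python
import xml.etree.ElementTree as ET

def get_temp(call):
    # Hypothetical stand-in for the remote service; a real peer would send a SOAP request.
    return [ET.fromstring("<temp>16\u00b0C</temp>")]  # \u00b0 is the degree sign

# Hypothetical registry of the services this peer is able to invoke.
SERVICES = {"Yahoo.GetTemp": get_temp}

def materialize_once(parent, replace=False):
    """Invoke each known <call> child of `parent` and insert its result before it."""
    for call in list(parent):  # iterate over a snapshot, since `parent` is modified
        if call.tag != "call" or call.get("svc") not in SERVICES:
            continue
        results = SERVICES[call.get("svc")](call)
        pos = list(parent).index(call)
        for offset, node in enumerate(results):
            parent.insert(pos + offset, node)
        if replace:  # optional: drop the call once its result is in place
            parent.remove(call)

doc = ET.fromstring(
    "<newspaper><title>Le Monde</title><date>06/10/2003</date>"
    '<call svc="Yahoo.GetTemp"><city>Paris</city></call>'
    '<call svc="TimeOut.GetEvents">exhibits</call></newspaper>'
)
materialize_once(doc)
print([child.tag for child in doc])  # ['title', 'date', 'temp', 'call', 'call']
```

The `TimeOut.GetEvents` call is left untouched here because this sketch has no entry for it in `SERVICES`.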

  12. AXML Web services
      Parameters: AXML data
      Result: AXML data
      Great flexibility
      • Distribute computations: by sending as parameters data containing service calls, one can delegate some work to other peers
      • Partial computations: by returning data containing service calls, one can give to the receiver the control of these calls

  13. Calling an AXML service

          <?xml version="1.0" ?>
          <newspaper>
            <title>Le Monde</title>
            <date>06/10/2003</date>
            <temp>16°C</temp>
            <exhibits>
              <call svc="Yahoo.GetExhibits">
                <city>Paris</city>
              </call>
            </exhibits>
            <call svc="TimeOut.GetEvents">
              exhibits
            </call>
          </newspaper>

      [Figure: the document tree after a SOAP call to GetEvents; the returned exhibits node still contains a call to GetExhibits]
      Materialization is a recursive process
      Termination is an issue
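
The recursion and the termination concern can be sketched in Python as follows. This is a simplified illustration, not the AXML engine: the services are hypothetical local functions, each call is replaced by its result (one of several possible policies), and a call budget `max_calls` is an assumed safeguard against services whose answers keep producing new calls.

```python
import xml.etree.ElementTree as ET

def get_events(call):
    # Hypothetical AXML service: its answer embeds another service call.
    return [ET.fromstring(
        '<exhibits><call svc="Yahoo.GetExhibits"><city>Paris</city></call></exhibits>')]

def get_exhibits(call):
    return [ET.fromstring("<exhibit>Le Louvre</exhibit>")]

def loop_again(call):
    # Hypothetical service that always answers with a call to itself.
    return [ET.fromstring('<call svc="Loop.Again"/>')]

SERVICES = {"TimeOut.GetEvents": get_events,
            "Yahoo.GetExhibits": get_exhibits,
            "Loop.Again": loop_again}

def materialize(root, max_calls):
    """Replace calls by their results until none is left or the budget is spent.

    Returns the number of calls that were invoked."""
    invoked = 0
    while invoked < max_calls:
        # ElementTree elements do not store their parent, so build a lookup.
        parents = {child: parent for parent in root.iter() for child in parent}
        call = next((c for c in root.iter("call")
                     if c in parents and c.get("svc") in SERVICES), None)
        if call is None:
            break  # nothing left to invoke
        parent = parents[call]
        pos = list(parent).index(call)
        parent.remove(call)
        for offset, node in enumerate(SERVICES[call.get("svc")](call)):
            parent.insert(pos + offset, node)
        invoked += 1
    return invoked

doc = ET.fromstring(
    '<newspaper><title>Le Monde</title>'
    '<call svc="TimeOut.GetEvents">exhibits</call></newspaper>')
calls_made = materialize(doc, max_calls=10)      # GetEvents, then GetExhibits

looping = ET.fromstring('<d><call svc="Loop.Again"/></d>')
calls_in_loop = materialize(looping, max_calls=3)  # stops only because of the budget
```

In the second document a call is still present when the budget runs out, which is the termination issue the slide points to.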

  14. Novel issues

  15. Active XML - Outline: Introduction; Active XML (Active XML documents, Active XML services); Novel issues (Exchanging Active XML data [SIGMOD’03, PODS’05], Querying Active XML data, Distribution and replication, Security and Access control); Active XML Peers (The peer as a client, The peer as a server, Theoretical foundations); Applications; Conclusion

  16. To call or not to call?
      [Figure: two copies of the newspaper document tree (title, date, temp, GetTemp, GetEvents)]
      Materialization can be performed
      • by the sender, before sending a document
      • or by the receiver, after receiving it

  17. Why control the materialization of calls?
      For added functionality, e.g.
      • Intensional data makes it possible to get up-to-date information
      For security reasons or capabilities, e.g.
      • I don’t trust this Web service/domain
      • I don’t have the right credentials to invoke it
      • It costs money
      • Maybe the receiver doesn’t know Active XML!
      For performance reasons, e.g.
      • A proxy can invoke all the services on behalf of a PDA
      … and many more reasons you can think of!

  18. How to control it? Using types
      We extend XML Schema with intensional types: XMLSchema_int
      [Figure: a sender and a receiver, each with its own capabilities, ACL, and cost, exchanging data that conforms to a data exchange schema]
      Casting algorithms use signatures of services: WSDL_int

  19. Rewritings
      The goal: given
      • an AXML document d
      • a schema s
      can we rewrite d so that it matches s?
      Safe rewriting: one that for sure leads to s (we know without making any call)
      Possible rewriting: one that possibly leads to s (depending on the answers of the services)

  20. Results
      The general problem is undecidable [MSS04]
      Restrictions on the considered rewritings
      • Left-to-right: no “going back and forth”
      • K-depth: bound on the nesting of function calls
      (Search space still infinite but finitely representable)
      Under these restrictions
      • We have algorithms to find safe/possible rewritings
      • They are PTIME (for deterministic schemas)
      • We can also do it between schemas
      Implementation
      • First demo at VLDB 2003 (customizable news syndication)
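
The difference between safe and possible rewritings can be illustrated with a brute-force Python sketch. It is not the PTIME algorithm mentioned on the slide, and it rests on strong assumptions: the document is a flat sequence of labels, nesting depth is 1 (service outputs contain no further calls), each hypothetical signature in `SIGNATURES` lists a finite set of possible outputs, and the target schema is a regular expression over tokens written as `<label>`.

```python
import itertools
import re

# Hypothetical signatures: service name -> every output it may return (tuples of labels).
SIGNATURES = {
    "GetTemp": [("temp",)],
    "GetEvents": [("exhibit",), ("exhibit", "exhibit"), ("performance",)],
}

def encode(word):
    """Turn a sequence of labels into a string such as '<title><date>'."""
    return "".join(f"<{label}>" for label in word)

def outcomes(doc, invoked):
    """Yield every word the document may become when the calls at `invoked` are made."""
    options = [SIGNATURES[item] if i in invoked else [(item,)]
               for i, item in enumerate(doc)]
    for combo in itertools.product(*options):
        yield tuple(itertools.chain.from_iterable(combo))

def classify(doc, schema):
    """Return ('safe' | 'possible' | 'none', positions of the calls to invoke)."""
    pattern = re.compile(schema)
    calls = [i for i, item in enumerate(doc) if item in SIGNATURES]
    possible = None
    for size in range(len(calls) + 1):
        for invoked in itertools.combinations(calls, size):
            hits = [bool(pattern.fullmatch(encode(word)))
                    for word in outcomes(doc, invoked)]
            if all(hits):
                return "safe", invoked      # every possible answer matches the schema
            if any(hits) and possible is None:
                possible = invoked          # some answer matches, but not all
    return ("possible", possible) if possible is not None else ("none", ())

doc = ["title", "date", "GetTemp", "GetEvents"]
safe_case = classify(doc, "<title><date><temp><GetEvents>")
possible_case = classify(doc, "<title><date><temp>(<exhibit>)+")
no_case = classify(doc, "<title><temp>")
```

With these assumed signatures, invoking only `GetTemp` is a safe rewriting for the first schema, while the second schema admits only a possible rewriting because `GetEvents` might return a `performance`.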

  21. Active XML - Outline: Introduction; Active XML (Active XML documents, Active XML services); Novel issues (Exchanging Active XML data, Querying Active XML data [SIGMOD’04, PODS’05], Distribution and replication, Security and Access control); Active XML Peers (The peer as a client, The peer as a server, Theoretical foundations); Applications; Conclusion

  22. Querying AXML Data
      Given a (tree pattern) query: /newspaper[temp > 18°C]/exhibits//exhibit[location=“Le Louvre”]
      Materialize the document? Call only the services that may contribute data to the query answer.
      [Figure: newspaper document tree with title “Le Monde”, temp “19° C”, an exhibits element, and calls GetTemp, getDate, GetEvents, GetExhibits]
      The problem: lazy evaluation of service calls
      To call or not to call, this time when evaluating a query

  23. Lazy evaluation
      Difficulties:
      • Calls can be found everywhere in the document
      • May appear dynamically (as a result of previous calls)
      • May become (ir)relevant due to previous invocations
      • Need to take signatures of calls into consideration
      Possible approach: modify the query processor
      • Trigger the calls found on the way
      • Not so great:
        - Computation is blocked
        - Optimization opportunities are lost
      Our solution:
      • Derive queries that find the relevant calls (recursively)
      • Use service signatures to prune irrelevant calls
      • Parallel call invocations
      • Pushing queries to capable external sources
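
Pruning by signatures can be sketched in Python as a path-compatibility test. This is a toy model rather than the technique from the cited papers: the query is flattened into the root-to-leaf label paths it touches, each call is described by a hypothetical triple (service name, path of its parent, label of the element it returns), and predicates on values such as `temp > 18°C` are ignored.

```python
def compatible(data_path, query_path):
    """Can `data_path` (labels from the root) lie on a match of `query_path`?

    `query_path` is a list of (axis, label) steps, with axis "/" or "//"."""
    last = len(query_path)
    states = {0}  # numbers of query steps already matched
    for label in data_path:
        reached = set()
        for j in states:
            if j == last:
                reached.add(last)   # below a matched node: part of its subtree
                continue
            axis, name = query_path[j]
            if name == label:
                reached.add(j + 1)  # this label matches the next step
            if axis == "//":
                reached.add(j)      # label skipped as an intermediate ancestor
        states = reached
        if not states:
            return False
    return True

# /newspaper[temp > 18°C]/exhibits//exhibit[location="Le Louvre"], flattened into paths.
QUERY_PATHS = [
    [("/", "newspaper"), ("/", "temp")],
    [("/", "newspaper"), ("/", "exhibits"), ("//", "exhibit"), ("/", "location")],
]

# Hypothetical calls: (service, path of the parent element, label the service returns).
CALLS = [
    ("GetTemp", ["newspaper"], "temp"),
    ("getDate", ["newspaper"], "date"),
    ("GetEvents", ["newspaper"], "exhibits"),
    ("GetExhibits", ["newspaper", "exhibits"], "exhibit"),
]

relevant = [svc for svc, where, out in CALLS
            if any(compatible(where + [out], q) for q in QUERY_PATHS)]
print(relevant)  # ['GetTemp', 'GetEvents', 'GetExhibits']
```

Under these assumed signatures `getDate` is pruned because a `date` element cannot lie on any path the query inspects; the remaining calls could then be invoked in parallel.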

  24. Active XML - Outline: Introduction; Active XML (Active XML documents, Active XML services); Novel issues (Exchanging Active XML data, Querying Active XML data, Distribution and replication, Security and Access control); Active XML Peers (The peer as a client, The peer as a server, Theoretical foundations); Applications; Conclusion
