DILIGENT: Digital libraries powered by the Grid
  Bhaskar Mehta, FhG IPSI, Germany
  Work Package Leader, Content and Metadata Management
  Bhaskar.Mehta@ipsi.fraunhofer.de
Overview
  • Introduction to DILIGENT
  • Grid: Opportunity and Challenge
  • Challenges in Information Management
  • Data Management in DILIGENT
  • Open Issues and Next Steps
International Symposium on Grid Computing, Taipei, 3rd May 2006
Introduction to DILIGENT
  DILIGENT: A Digital Library Infrastructure on Grid-Enabled Technology
  • Duration: 3 years
  • Commencement date: September 2004
  • Effort: 1024 person-months (p/m)
  • Cost: 9.8 M Euro
  • European Union funding: 6.3 M Euro
Partners
  • Consiglio Nazionale delle Ricerche – ISTI (Italy, Scientific Co-ordinator)
  • European Research Consortium for Informatics and Mathematics (France, Administrative Co-ordinator)
  • University of Athens (Greece)
  • Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. – IPSI (Germany)
  • University for Health Informatics and Technology Tyrol (Austria) / ETH Zürich / UNI Basel
  • University of Strathclyde (United Kingdom)
  • Engineering Ingegneria Informatica SpA (Italy)
  • Fast Search & Transfer ASA (Norway)
  • 4D SOFT Software Development Ltd. (Hungary)
  • European Organization for Nuclear Research (Switzerland)
  • European Space Agency – ESA (Italy)
  • Scuola Normale Superiore (Italy)
  • RAI Radio Televisione Italiana (Italy)
DILIGENT Objectives
  To create an advanced test-bed that will allow members of dynamic virtual e-Science organizations to access shared knowledge and to collaborate in a secure, coordinated, dynamic and cost-effective way.
  Expected Outcome
  • A Digital library infrastructure which is Grid based
  • A test bed based on this infrastructure
  • Two implemented scenarios: Earth Science and Cultural Heritage
Motivation for NGDLs: Digital library challenges
  Cost & Time
    • Construction and management of a DL require high investments and specialized personnel
    • Years are spent in designing and setting up a DL
    • Shared Infrastructure with Authoring Capabilities
  New functionality is computationally expensive and evolving
    • Multimedia indexing, clustering: e.g. LSI, pLSA
    • Multimedia querying: e.g. image retrieval by feature vectors
    • Multimedia processing: e.g. satellite images, partial encryption for video
    • Service Based Digital Libraries, with process management/distribution support
  Heterogeneity & Distribution
    • DLs (and underlying components) use different models, APIs, data formats, etc.
    • DLs are distributed/replicated
    • Basing DLs on standards
    • Providing support for federated/distributed search, data brokering
Grid as an Opportunity... and a Challenge
  [Figure: diagram relating the Grid and the Digital Library (DL); labels: Potential, Challenge, Grid OS]
Some Methodological Challenges
  Service Oriented, Distributed Architecture
    • Requires open systems for indexing, searching, feature extraction, metadata management
  Distributed Search
    • Query Optimization
    • Semantic Data Integration
  On Demand Service Activation
    • Satellite Images Extraction
  Virtual Organizations
    • Content Security
    • Resource Security
Some Technological Challenges
  Grid technology is file centric: DLs are collection centric
  Metadata management with the Grid: based on key-value pairs
    • Lack of support for structured data (e.g. XML)
    • Retrieval support is limited
  Availability and replication support: file based
  Real time processing vs batch processing
    • DL users require instantaneous response (à la Google)
    • Grid processes usually can't provide real time response
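The mismatch between key-value metadata and structured data such as XML can be illustrated with a minimal sketch. It is not taken from DILIGENT or gLite: the sample record, the `flatten` helper and the path-style keys are all assumptions made for illustration. It shows that flattening a nested record into key-value pairs loses repeated elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata record, invented for this illustration.
RECORD = """
<record>
  <title>Sea surface temperature report</title>
  <creator><name>A. Author</name><role>editor</role></creator>
  <creator><name>B. Author</name><role>analyst</role></creator>
</record>
"""


def flatten(element, prefix=""):
    """Flatten an XML tree into path -> text pairs.

    A plain key-value store holds one value per key, so a repeated
    element (here the second <creator>) overwrites the first one.
    """
    pairs = {}
    for child in element:
        path = f"{prefix}/{child.tag}"
        if len(child):  # the element has sub-elements: recurse
            pairs.update(flatten(child, path))
        else:
            pairs[path] = (child.text or "").strip()
    return pairs


root = ET.fromstring(RECORD)

# Key-value view: only one creator survives.
flat = flatten(root)

# Structured view: both creators are still addressable.
creators = [c.findtext("name") for c in root.findall("creator")]

print(flat)
print(creators)
```

A real key-value catalog could work around this with indexed keys or multi-valued attributes, but queries over the nesting would still need support that flat pairs do not give by themselves.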
DILIGENT Architecture
Data Management in DILIGENT (1)
  Common functionality for Content and Metadata management -> Effort duplication
    • Storage, Replication
    • Change Notification, Association
    • Consistency
  gLite functionality
    • Separate pipelines for Content and Metadata
    • Incomplete functionality (e.g. replication)
    • Insufficient for DILIGENT
    • FileSystem vs Data Model
    • Flat records vs XML
  [Figure labels: CM, MM, Emulation, Common Layer, gLite, XMLDB]
Data Management in DILIGENT (2)
  Identifying 3 basic layers
    • Base Layer: gLite functionality (SE, Catalog, FTS)
    • Storage Layer: (Replication, Storage Management, change notification, transactional support)
    • Service Layer: Service specific functionality, API/WS view
  [Figure labels: Content Management, Metadata Management, Storage Layer, Base Layer]
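The three-layer split can be sketched as follows. This is an illustrative toy under stated assumptions, not the DILIGENT design or a gLite API: every class and method name (`BaseLayer`, `StorageLayer`, `ContentService`, `store`, `load`, `subscribe`) is hypothetical, the base layer is an in-memory dictionary standing in for file-level storage, and transactional support is left out.

```python
from typing import Callable, Dict, List


class BaseLayer:
    """Hypothetical stand-in for file-level storage at one site."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def put(self, name: str, data: bytes) -> None:
        self.files[name] = data

    def get(self, name: str) -> bytes:
        return self.files[name]


class StorageLayer:
    """Adds replication and change notification on top of base stores."""

    def __init__(self, replicas: List[BaseLayer]) -> None:
        self.replicas = replicas
        self.listeners: List[Callable[[str], None]] = []

    def subscribe(self, listener: Callable[[str], None]) -> None:
        self.listeners.append(listener)

    def store(self, name: str, data: bytes) -> None:
        for replica in self.replicas:  # write every replica
            replica.put(name, data)
        for listener in self.listeners:  # then notify subscribers
            listener(name)

    def load(self, name: str) -> bytes:
        for replica in self.replicas:  # first replica holding the file wins
            if name in replica.files:
                return replica.get(name)
        raise KeyError(name)


class ContentService:
    """Service layer: a content-specific view over the storage layer."""

    def __init__(self, storage: StorageLayer) -> None:
        self.storage = storage

    def add_document(self, doc_id: str, text: str) -> None:
        self.storage.store(f"content/{doc_id}", text.encode("utf-8"))

    def get_document(self, doc_id: str) -> str:
        return self.storage.load(f"content/{doc_id}").decode("utf-8")


sites = [BaseLayer(), BaseLayer()]
storage = StorageLayer(sites)
changes: List[str] = []
storage.subscribe(changes.append)

service = ContentService(storage)
service.add_document("doc-1", "hello")

# Simulate losing the copy at the first site; the second replica still serves it.
del sites[0].files["content/doc-1"]
recovered = service.get_document("doc-1")
```

A metadata service could sit beside `ContentService` on the same `StorageLayer`, which is the point of factoring out the common functionality rather than implementing it twice.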
Data management in DILIGENT (3)
  [Figure: component diagram; labels as extracted: API / WSDL; Metadata Management; Annotation Manager Processor Content Metadata Query Broker Manager; Content Security; Metadata Catalog; Storage Layer; Base Layer]
Current Status and Future Steps
  • Detailed design has been completed
  • APIs under implementation
  • 1st experimental prototype based on OpenDLib
  • Extensive testing and deployment of gLite 1.1 -> 3.0
  Next Steps
  • Integrate finished components
  • Deploy DILIGENT on the Grid infrastructure
  • Develop prototypes based on DILIGENT
  • Testing & user feedback
Types of Involvement for Observers
  • Information about project activities (www.diligentproject.org)
  • Involvement in workshops
  • Possible involvement in validation
  • Feedback for DILIGENT development
  • Candidates for adoption of DILIGENT infrastructure
Contact us
  Co-operation with other projects/communities is welcome
  www.diligentproject.org
  Contact people:
  • Donatella Castelli, Pasquale Pagano, ISTI-CNR: donatella.castelli/pasquale.pagano@isti.cnr.it
  • Jessica Michael, ERCIM: jessica.michel@ercim.org
  • Bhaskar Mehta, Fraunhofer IPSI: bhaskar.mehta@ipsi.fraunhofer.de
Questions?
Research today
  • Research is carried out by groups of individuals, belonging to different institutions, that dynamically aggregate to carry out projects together
  • By sharing their resources these individuals create better conditions for their research
  • Digital libraries that maintain the produced knowledge and make it accessible worldwide are becoming key instruments for scientific collaboration in many research areas