Homer: a case study of federation among open data portals



  1. Homer: a case study of federation among open data portals Nives Alciato - CSI Piemonte nives.alciato@csi.it

  2. The initiative of Piedmont Region • Regional law on Open Data • Guidelines for reuse • Adoption of a standard licence model • Creation of a working group • Diffusion to other Public Administrations • Reuse at national level • European Projects • Metadata catalogues • Data uploading platform • A portal as an access point for data and information

  3. Legal framework Regional Law n. 24 dated 23/12/2011 • First regional law in Italy on Open Data Basic principle: • Data belong to people Cornerstones of reusability of data: • Diffusion without restriction and in open and standard digital formats • Use of standard legal tools (Creative Commons licences) • Re-use and re-distribution of data are free of charge

  4. Organizational framework • Regional level: an initiative with ANCI Piemonte (association of municipalities); dati.piemonte.it is the infrastructure for the whole regional territory (120 municipalities and other bodies such as ARPA Piemonte and Unioncamere) • National level: re-use of the platform and joint project with Emilia-Romagna Region and the Municipality of Milan • European level: HOMER project to transfer methodological and technical standards and increase the circulation and re-use of public data; OPENDAI project to improve a new architectural model to increase digital services and business opportunities

  5. Technological framework: a permanent beta • Harmonize policies and licenses for the re-use of data • Federation of Open Data Portals • Open data silos • Cloud architecture • Open data services [Diagram labels: Search Portal; new platform from Open Data to Data Services and a Federated Search Engine; PA; operational databases of PAs]

  6. HOMER is the acronym of Harmonising Open data in the MEditerranean through better access and Reuse of public sector information www.homerproject.eu • It is a project within the MED Programme financed by the EU Commission • Implementation starting date: 01/04/2012 • Implementation end date: 31/03/2015

  7. Who are Homer's partners? 13 partners as territorial governments and 6 partners as technological support (Country: Partner, Mission) • Spain: AGAPA, Agencia de Gestión Agraria y Pesquera de Andalucía (Territorial Gov.); SARGA, Sociedad Aragonesa de Gestión Agroambiental (Territorial Gov.); FUNDITEC, Foundation for Development, Innovation and Technology (Technical Support) • France: Région Provence-Alpes-Côte d'Azur (Territorial Gov.); Région Corse (Territorial Gov.); AVITEM, Agency for sustainable Mediterranean cities and territories (Technical Support); FING, Fondation Internet Nouvelle Génération (Technical Support) • Italy: Piedmont Region (Project Leader); Sardinia, Emilia-Romagna and Veneto Regions (Territorial Gov.); CSI Piemonte (Technical Support) • Slovenia: Geodetic Institute (Territorial Gov.) • Montenegro: Mediterranean University of Montenegro (Territorial Gov.) • Greece: GFOSS, the Greek Free Open Source Software Society (Technical Support) • Crete: Decentralized Administration of Crete (Territorial Gov.); University of Crete (Technical Support) • Cyprus: Sewerage Board of Limassol – Amathus (Territorial Gov.) • Malta: Local Councils' Association of Malta and Gozo (Territorial Gov.)

  8. HOMER’s objectives • a federation of Open Data portals among partners, sharing common datasets related to the MED strategic domains (agriculture, culture, energy, environment, tourism) • ensuring long-term sustainability and exploiting a large number of harmonized and federated datasets • enhancing the e-participation and digital market opportunities of MED citizens CSI Piemonte’s responsibilities in HOMER • it is the developer of the Federation of Open Data Portals among partners, providing ICT and legal support • it is the promoter of the reuse of the technological solutions underlying the portal, developed in the context of the project

  9. What do we mean by federation of open data portals? “Federation” means the virtual system built around software able to collect and retrieve the metadata of published data belonging to the 5 categories (agriculture, culture, energy, environment, tourism), as exposed and searched by the partners' Open Data Portals. A symbol on the slide represents the metadata catalogue.

  10. Design, methodology, and approach • Memorandum of Understanding • Definition of a metadata common structure for federation • Use of EuroVoc • The cross lingual search • The federated search multi-language engine • The indexing scenario • The searching scenario

  11. Legal framework - Memorandum of Understanding • Partners joined by signing a Memorandum of Understanding in which technological, organizational and legal boundaries are defined as a common understanding for everybody, with reference to Directive 2013/37/EU • It states that all technological components of the solution for the Federation (Index, Semantic Search Engine, Translator) are provided and managed under the conditions and coordination of CSI Piemonte, which releases them on the basis of an open source philosophy

  12. Data framework – the metadata structure Each Open Data Portal shares common metadata fields: this structure builds the Federated Index. By intersecting the protocols and directives in the schema (CKAN, Dublin Core, INSPIRE, DCAT), the minimum common set of fields has been identified for defining a metadata structure to federate, independent of the type of dataset (geographical or alphanumerical). Fields: • title • description • url metadata source • package_id • topics • language • tags • geographic bounding box • refresh date • creation date • spatial scale resolution • license id • owner
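The common field set above can be pictured as a single record type. The following is a minimal sketch, not the project's actual schema: the class name `FederatedRecord`, the Python types, the choice of which fields are required, and the bounding-box ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional, Tuple


@dataclass
class FederatedRecord:
    """One entry of the federated index, using the 13 common fields from the slide."""
    title: str
    description: str
    url_metadata_source: str
    package_id: str
    language: str  # ISO 639-1 code, e.g. "it"
    topics: List[str] = field(default_factory=list)
    tags: List[str] = field(default_factory=list)
    # (west, south, east, north) is an assumed ordering, not stated in the source
    geographic_bounding_box: Optional[Tuple[float, float, float, float]] = None
    refresh_date: Optional[str] = None
    creation_date: Optional[str] = None
    spatial_scale_resolution: Optional[str] = None
    license_id: Optional[str] = None
    owner: Optional[str] = None


# Names of all common fields, in declaration order.
FIELD_NAMES = [f.name for f in fields(FederatedRecord)]

# Assumption: these five fields are treated as mandatory in this sketch.
REQUIRED = ("title", "description", "url_metadata_source", "package_id", "language")


def missing_required(raw: dict) -> List[str]:
    """Return the required field names that are absent or empty in a raw metadata dict."""
    return [name for name in REQUIRED if not raw.get(name)]
```

A loader could call `missing_required` on each metadata card before building a `FederatedRecord`, so that incomplete cards are reported rather than silently indexed.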

  13. Data framework – the use of EuroVoc (1) Homer now speaks 7 languages (Spanish, French, Italian, Slovenian, Serbian-Montenegrin, Greek and English) with 4 different alphabets, and we must share a dictionary to communicate. The language field holds the ISO 639-1 code that identifies the language. Fields: • title • description • url metadata source • package_id • topics • language • tags • geographic bounding box • refresh date • creation date • spatial scale resolution • license id • owner

  14. Data framework – the use of EuroVoc (2) EuroVoc is a multilingual, multidisciplinary thesaurus of the EU, conformant to W3C recommendations; in it, a specific concept of the 5 categories involved has the same classification and meaning across domains and languages. Homer’s categories = EuroVoc domains. Each ODP inserts tags in the metadata cards in its own language, without the burden of translation. The same concept is identified in all languages: WATER, νερό, VODA, вода, AGUA, EAU, ACQUA. Fields: • title • description • url metadata source • package_id • topics • language (ISO 639-1 code to identify the language) • tags • geographic bounding box • refresh date • creation date • spatial scale resolution • license id • owner
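The idea that one concept is identified in all languages can be sketched as a lookup from (language, label) to a shared concept key. This is only an illustration: the identifier `concept:water` is hypothetical (real EuroVoc concepts have their own identifiers, which the slides do not give), and assigning "voda" to Slovenian and "вода" to Serbian is an assumption.

```python
from typing import Optional

# Hypothetical concept key; not a real EuroVoc identifier.
WATER = "concept:water"

# (ISO 639-1 language code, lower-cased label) -> shared concept key.
# The language assigned to each label is an assumption for this sketch.
LABELS = {
    ("en", "water"): WATER,
    ("el", "νερό"): WATER,
    ("sl", "voda"): WATER,
    ("sr", "вода"): WATER,
    ("es", "agua"): WATER,
    ("fr", "eau"): WATER,
    ("it", "acqua"): WATER,
}


def concept_for(language: str, tag: str) -> Optional[str]:
    """Map a portal's own-language tag to the shared concept, or None if unknown."""
    return LABELS.get((language, tag.strip().casefold()))
```

With such a table, a search for "acqua" and a dataset tagged "eau" resolve to the same key, which is what lets a query in one language match metadata written in another.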

  15. Data framework – The cross lingual search The semantic multi-language search engine needs a specific common structure to index and retrieve the metadata of all the metadata catalogues of the Homer partners' Open Data Portals. The search engine is like a librarian who finds books only if the request form is filled out in a specific way.

  16. Technological framework: the federated search multi language engine The technological solution for indexing and searching among all the federated open data portals has 4 components: 1. Fed-Index Homer: the federated index file component containing the complete list of metadata 2. Fed-Translator: the component that translates every tag of the datasets via EuroVoc 3. Fed-Searcher: the centralized semantic search engine component 4. Fed-Loader API: the loader that calls the API or web services exposed by each Open Data Portal to create the federated index. Based on the open source project Apache Solr. Released as open source on SourceForge.

  17. Technological framework: the indexing scenario (1) The indexing process requires each federated portal to expose the metadata cards of its data through 2 types of URL: 1. url1, which returns the list of data ids (Package List) 2. url2, which returns the attributes of a single dataset (Package Dataset). It is a stand-alone scheduled process, which could run nightly.
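The two-URL pattern amounts to a simple loop: fetch the list of ids, then fetch each dataset's attributes. The sketch below shows that loop only; the function name `index_portal` and the two callables passed in are hypothetical, and how each portal is actually called and parsed is left to those callables.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def index_portal(
    portal: str,
    fetch_package_list: Callable[[], Iterable[str]],
    fetch_package_dataset: Callable[[str], dict],
) -> Tuple[Dict[Tuple[str, str], dict], List[str]]:
    """Collect metadata for one portal.

    fetch_package_list stands in for a call to url1 (Package List);
    fetch_package_dataset stands in for a call to url2 (Package Dataset).
    Returns the collected records keyed by (portal, package_id) and the
    ids whose dataset call failed.
    """
    records: Dict[Tuple[str, str], dict] = {}
    failed: List[str] = []
    for package_id in fetch_package_list():
        try:
            records[(portal, package_id)] = fetch_package_dataset(package_id)
        except Exception:
            # One broken metadata card should not stop the whole run.
            failed.append(package_id)
    return records, failed
```

Keeping the network calls behind callables means the same loop can serve any of the portals, and a scheduler (for instance a nightly job) only needs to invoke it once per federated portal.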

  18. Technological framework: the indexing scenario (2) [Diagram: the Open Data Portals, each with tags in its own language (Voda, Eau, Water, Agua), feed the Search Engine through a scheduled process]

  19. Technological framework: the indexing scenario (3) 3 supported ways to expose the metadata: • CKAN-compliant API: Package List > url1 that returns an XML file with the list of data ids; Package Dataset > url2 that returns an XML file with the attributes of a single dataset • dati.piemonte.it-compliant web services: Package List > url1 that returns an XML file with the list of data ids, http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_list2&format=xml&layout=xml ; Package Dataset > url2 that returns an XML file with the attributes of a single dataset, http://www.dati.piemonte.it/index.php?option=com_rd&view=pceli_item2&format=xml&layout=xml&itemid=1083 • Catalogue Service for the Web (CSW)-compliant API: Package List > url1 that returns a CSW file with the list of data ids; Package Dataset > url2 that returns a CSW file with the attributes of a single dataset
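For the dati.piemonte.it case, the two URLs differ only in the `view` parameter and the extra `itemid`. A small sketch of how a loader might assemble them follows; the function names are hypothetical, and the parameter values are copied from the two example URLs on the slide rather than from any documented API.

```python
from urllib.parse import urlencode

# Base address taken from the example URLs on the slide.
BASE = "http://www.dati.piemonte.it/index.php"


def package_list_url() -> str:
    """url1: the Package List endpoint, as shown on the slide."""
    query = {"option": "com_rd", "view": "pceli_list2", "format": "xml", "layout": "xml"}
    return BASE + "?" + urlencode(query)


def package_dataset_url(item_id: int) -> str:
    """url2: the Package Dataset endpoint for one data id, as shown on the slide."""
    query = {
        "option": "com_rd",
        "view": "pceli_item2",
        "format": "xml",
        "layout": "xml",
        "itemid": item_id,
    }
    return BASE + "?" + urlencode(query)
```

Calling `package_dataset_url(1083)` reproduces the second example URL from the slide; the ids to pass in would come from the response of `package_list_url()`.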

