Search Computing November 7, 2011 Stefano Ceri ... and the SeCo Project Team Adnan Abid, Mamoun Abu Helu, Davide Barbieri, Daniele Braga, Marco Brambilla, Alessandro Bozzon, Alessandro Campi, Davide Chicco, Emanuele Della Valle, Piero Fraternali, Nicola Gatti, Giorgio Ghisalberghi, Michael Grossniklaus, Davide Martinenghi, Marco Masseroli, Maristella Matera, Chiara Pasini, Silvia Quarteroni, Marco Tagliasacchi, Luca Tettamanti, Salvatore Vadacca, Serge Zagorac
Motivating Examples – What Search Engines can’t do
“Where can I find a theater close to Union Square, San Francisco, showing a recent thriller movie, close to a good steak house?”
Prof. Stefano Ceri Database Management
Search For a Solution Using All Keywords
Split the task, and search for theaters first
But there’s no thriller! Try another theater: Found! (The Next Three Days), close enough to Union Square...
Independent search for steak house
Done! Close enough! (data integration and ranking in the user’s brain)
VISION
The Search Computing Project
• ERC-funded project, 5 years – started in 2009, now at month 36
• Build theories, methods and tools to support search-oriented multi-dimensional queries
– Given a multi-domain query
– Build global solutions by integrating data produced by search services
– Rank global solutions according to a global rank function and output results in rank order
– Support user-friendly interfaces for query definition and result browsing, which allow adding search domains while the search process proceeds and possibly changing the relative weight of each ranking
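The ranking step listed above can be illustrated with a minimal sketch. It assumes that each service returns results with a local score in [0, 1], that the global rank function is a weighted sum of local scores, and that the join condition is a distance threshold on a one-dimensional position. The function name, field names and data are hypothetical and are not taken from the SeCo system.

```python
from itertools import product

def rank_combinations(theaters, restaurants, weights, max_distance_km, top_k=3):
    """Join two result lists and order the combinations by a weighted global score."""
    w_theater, w_restaurant = weights
    combos = []
    for t, r in product(theaters, restaurants):
        # Assumed join predicate: the two places must be close to each other.
        if abs(t["pos_km"] - r["pos_km"]) <= max_distance_km:
            score = w_theater * t["score"] + w_restaurant * r["score"]
            combos.append((score, t["name"], r["name"]))
    # Highest global score first; names break ties so the order is stable.
    combos.sort(key=lambda c: (-c[0], c[1], c[2]))
    return combos[:top_k]

# Hypothetical local results with local scores and 1-D positions.
theaters = [
    {"name": "T1", "score": 0.9, "pos_km": 0.0},
    {"name": "T2", "score": 0.6, "pos_km": 2.0},
]
restaurants = [
    {"name": "R1", "score": 0.8, "pos_km": 2.5},
    {"name": "R2", "score": 0.7, "pos_km": 0.5},
]

balanced = rank_combinations(theaters, restaurants, (0.5, 0.5), max_distance_km=1.0)
food_first = rank_combinations(theaters, restaurants, (0.2, 0.8), max_distance_km=1.0)
```

With these inputs, changing the weights from (0.5, 0.5) to (0.2, 0.8) swaps the order of the two valid combinations, which is the kind of re-weighting the last bullet refers to.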
Search Computing = Search Service Composition
• Searching the Web of Data requires demand-driven service composition
• Composition abstractions should emphasize few elements: service invocations, fundamental operations, precedences, global constraints on execution
• Data composition should be search-driven – producing few top results very fast
[Figure: pipe vs. parallel composition of Trulia.com (real estate), Metro.net (public transit), LocalCensus.com (demographics) and Walkscore.com (walkability). Labels: “GOOD, 30 results, 5 seconds, 50 calls” and “GOOD, 30 results, 10 calls”]
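The difference between pipe and parallel composition can be sketched with stub services that count their invocations. This is only an illustration under assumptions: the class, the function names and the data are hypothetical, the "parallel" calls are run one after the other for simplicity, and the call counts do not reproduce the figures on the slide.

```python
class CountingService:
    """Stub search service that counts how many times it is invoked."""

    def __init__(self, data):
        self.data = data  # maps an input key to a list of results
        self.calls = 0

    def search(self, key):
        self.calls += 1
        return self.data.get(key, [])

def pipe_join(homes, transit_by_zone, city):
    """Pipe: every home found triggers one call to the transit service."""
    out = []
    for home, zone in homes.search(city):
        for line in transit_by_zone.search(zone):
            out.append((home, line))
    return out

def parallel_join(homes, transit_by_city, city):
    """Parallel: both services are called once, then results are matched on zone."""
    left = homes.search(city)
    right = transit_by_city.search(city)
    return [(home, line) for home, hz in left for tz, line in right if hz == tz]

# Hypothetical data: three homes in two zones of one city.
home_data = {"LA": [("home1", "z1"), ("home2", "z2"), ("home3", "z1")]}
transit_by_zone = CountingService({"z1": ["bus 10"], "z2": ["metro A"]})
transit_by_city = CountingService({"LA": [("z1", "bus 10"), ("z2", "metro A")]})

piped = pipe_join(CountingService(home_data), transit_by_zone, "LA")
merged = parallel_join(CountingService(home_data), transit_by_city, "LA")
```

Both strategies return the same combinations here, but the piped transit service is called once per home while the parallel one is called once in total.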
Modular software view of search applications
• New generation software for building focused search applications
• Covering the functionalities of vertical search systems (e.g. “expedia”, “amazon”) on more focused application domains (e.g. localized real estate or leisure planning, sector-specific job market offers, support of biomed research, ...)
• Should be easy-to-build, easy-to-query, easy-to-maintain, easy-to-scale...
TECHNOLOGICAL FRAMEWORK
Search Computing architecture: overall view
[Figure: main query flow through the architecture, with a cache attached to each stage]
• High-level query (final user, front end): “Where can I attend a DB scientific conference close to a beautiful beach reachable with cheap flights?”
• Query Analysis splits it into sub-queries
– Sub query 1: “Where can I attend a DB scientific conference?”
– Sub query 2: “place close to a beautiful beach?”
– Sub query 3: “place reachable with cheap flight?”
• Query To Domain Mapper produces low-level queries
– Low level query 1: ConfSearch(“DB”,placeX,dateY)
– Low level query 2: TourSearch(“Beach”,PlaceX)
– Low level query 3: Flight(“cost<200”,PlaceX,DateY)
• Query Planner produces a concrete query plan
• Query Engine: service invocations and operators execution (OP 1, OP 2, ... OP N) through the WS-Framework
• Result Transformation: merged results, presented to the user as ESWC-Crete-Olympic, CAISE-Hammamet-Alitalia, TOOLS-Malaga-EasyJet
• <Uses> relation: Domain Framework, WS World, Domain Repository, Service Repository
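The three low-level queries on this slide share the variables PlaceX and DateY, which is what allows their results to be combined. A minimal sketch of that idea follows. The `LowLevelQuery` class and `shared_variables` function are hypothetical, and the variable casing is normalized to PlaceX and DateY as an assumption, since the slide mixes placeX and PlaceX.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LowLevelQuery:
    service: str      # service name as written on the slide
    constant: str     # the fixed keyword or condition
    variables: tuple  # variables to be bound by the results

queries = [
    LowLevelQuery("ConfSearch", "DB", ("PlaceX", "DateY")),
    LowLevelQuery("TourSearch", "Beach", ("PlaceX",)),
    LowLevelQuery("Flight", "cost<200", ("PlaceX", "DateY")),
]

def shared_variables(queries):
    """Map each variable used by two or more services to those services (the join keys)."""
    usage = {}
    for q in queries:
        for v in q.variables:
            usage.setdefault(v, []).append(q.service)
    return {v: services for v, services in usage.items() if len(services) > 1}

join_keys = shared_variables(queries)
```

Here `join_keys` reports that PlaceX links all three services and DateY links ConfSearch and Flight, so a combined result such as a conference, a beach place and a flight must agree on those values.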
Search Computing architecture: incremental prototyping
• Prototype 1: Core behaviour of the system
– Query engine
– Domain repository
– Service repository
– Result presentation
• Prototype 2: Vertical solutions
– ER Domain description
– Query planner
– Application design tools
• Prototype 3: Ontology-driven search
– Ontological query interpretation
– Ontological description & annotation of services
• Prototype 4: NL or keyword queries
[Figure: the four prototypes overlaid on the architecture diagram (Front End, Query Analysis, Query To Domain Mapper, Query Planner, Query Engine, Result Transformation, Admin Interface, WS-Framework, repositories and caches)]
LIQUID QUERY INTERFACE
Liquid query definition
It consists of subsetting and parametrizing the resource graph...
[Figure: resource graph with nodes News, Restaurant, Exhibition, Piece, Concert, Artist, Photo, Hotel, Movie, Metro Station, Theatre, Landmark, ShoppingCenter, ...; legend: inputs, outputs; GR = global ranking]
Liquid query definition
... and then characterizing the user interaction
[Figure: selected subgraph with nodes News, Restaurant, Exhibition, Concert, Artist, Photo, Hotel, Metro Station, and an Expand action]
Plus:
• Parametrization of global ranking
• Data visualization options
• ... and so on
Exploration of the Service Space: Entity Selection
Exploration of the Service Space: Service Selection, Entity Selection
Exploration of the Service Space: Service Selection, Entity Selection, Query !!
Result Presentation
[Figure: tabular representation, with callouts for Order, Ranking, Local Filter Bar and Projection]
Result Presentation (Map)
Exploration options from a given state: Related Entities
Result Presentation (Atom View)
[Figure labels: Real Estate Service, Doctor Service, Association]
Result Visualization – Combinations on Maps
SERVICE REGISTRATION
Rationale of Service Registration
• Providing a “Semantic Resource Framework” (SRF) where concepts of the real world are mapped to entities and interconnected by relationships
• Following the idea of the “web of objects” instead of the “web of pages”
[Figure: resource graph with nodes News, Restaurant, Exhibition, Piece, Concert, Artist, Photo, Hotel, Movie, Metro Station, Theatre, Landmark, ShoppingCenter, ...]
Behind the scenes...
Service Framework in SeCo
• A Service Description Framework (SDF) coupled with a Semantic Annotation Framework
• Service Description Framework
– Conceptual representation: Service Mart (group services by core entities)
– Logical representation: Access Pattern (i/o fields and transitions as domain entities/relationships)
– Physical representation: Service Interface (as shipped by data provider)
• Semantic Annotation Framework
– Domain Diagram: entities/relationships as “mentioned” in SDF
– Reference KB: capturing of service semantics via Knowledge Base lookup
Semantic Framework: Domain Diagram and Access Patterns
[Figure: domain diagram. Domain concepts: Theatre, Actor, Movie, Film_Director, Prize. Access patterns: TheatrebyMovie, ActorByTitle, TheaterByFLD, MovieByTitle, PrizeByDirector. Other labels: Current Date, Current Position]
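An access pattern can be read as a service signature with input and output fields over domain concepts. The sketch below shows how such signatures constrain the order in which services can be invoked. The access pattern names come from the slide, but the input and output field lists are illustrative assumptions (the slide does not give them), and `invocation_order` is a hypothetical helper.

```python
# name: (input fields, output fields). The field lists are assumed for illustration.
ACCESS_PATTERNS = {
    "MovieByTitle": ({"Title"}, {"Movie", "Director"}),
    "TheatrebyMovie": ({"Movie", "Date"}, {"Theatre"}),
    "PrizeByDirector": ({"Director"}, {"Prize"}),
}

def invocation_order(patterns, known_fields):
    """Order access patterns so that every input is bound before the call.

    Returns (ordered pattern names, names that can never be invoked).
    """
    known = set(known_fields)
    pending = dict(patterns)
    order = []
    progress = True
    while pending and progress:
        progress = False
        for name in sorted(pending):
            inputs, outputs = pending[name]
            if inputs <= known:
                order.append(name)
                known |= outputs  # outputs become available to later patterns
                del pending[name]
                progress = True
                break
    return order, sorted(pending)

feasible = invocation_order(ACCESS_PATTERNS, {"Title", "Date"})
blocked = invocation_order(ACCESS_PATTERNS, {"Date"})
```

Starting from a title and a date, all three patterns become invocable in turn; starting from a date alone, none of them can be called, because no pattern has all of its inputs bound.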
SECO ENGINE
The Query Processor in the Big Picture
The Query Processor in the Big Picture
[Figure labels: Workbench, Query Processor, testing tool]
The Query Processor in the Big Picture
[Figure: SeCoQL query at the top; Logical Level plan with nodes IN, Movie, Theatre, Restaurant, OUT; Physical Level plan in Panta Rhei; labels: Workbench, Query Processor, testing tool]
SeCoQL
“An old drama movie showing tonight in a theatre close to a good restaurant”
NightOut(Piccadilly, London, UK)
First step: from conjunctive queries to logical plans
Generation of a logical plan
[Figure: logical plan with nodes IN, Movie, Theatre, Restaurant, OUT]
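A logical plan of this kind can be represented as a directed acyclic graph of service nodes between IN and OUT. The sketch below derives one valid invocation order with Python's standard `graphlib` module. The edges form an assumed linear chain, since the slide text does not preserve the arrows of the figure, and this is not the SeCo planner itself.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on.
# Assumed chain: IN -> Movie -> Theatre -> Restaurant -> OUT.
logical_plan = {
    "Movie": {"IN"},
    "Theatre": {"Movie"},
    "Restaurant": {"Theatre"},
    "OUT": {"Restaurant"},
}

def execution_order(plan):
    """Return one valid invocation order for the nodes of the plan."""
    return list(TopologicalSorter(plan).static_order())

order = execution_order(logical_plan)
```

`TopologicalSorter` raises `graphlib.CycleError` if the plan contains a cycle, which gives a cheap sanity check on a candidate plan.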
Second step: from logical to physical query plans
Then the planner generates a physical, executable query plan, expressed in Panta Rhei
[Figure: logical plan with nodes IN, Movie, Theatre, Restaurant, OUT, and the corresponding physical plan; physical plan labels: Movie, (1,1,T), Restaurant, Theater, (1,10,R)]