On bridging relational and document-centric data stores John Roijackers and George Fletcher Eindhoven University of Technology BNCOD 2013 Oxford, UK 9 July 2013
Outline 1 Introduction 2 Theoretical framework 3 Experimental framework 4 Empirical analysis 5 Conclusions Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 2 / 29
Outline 1 Introduction 2 Theoretical framework 3 Experimental framework 4 Empirical analysis 5 Conclusions Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 3 / 29
NoSQL Definition An SQL database is a “traditional” database built around the relational model, which can be queried using SQL . A NoSQL database is a database that is not an SQL database. Data is not stored in relations and/or the main query language to retrieve data is not SQL . Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 4 / 29
NoSQL Definition An SQL database is a “traditional” database built around the relational model, which can be queried using SQL . A NoSQL database is a database that is not an SQL database. Data is not stored in relations and/or the main query language to retrieve data is not SQL . Properties of NoSQL databases: Non-relational, schemaless, e.g., “document”-oriented (JSON) Often, distributed architecture sacrifice full Atomicity, Consistency, Isolation, Durability Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 4 / 29
Problem statement Situation: Different types of data for a single application Data is split over an SQL and a (document-centric) NoSQL database Data separation problem: Combine data from both sources Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 5 / 29
Problem statement SQL data: Places id name population stations 1 New York City 8 . 2 M 468 2 Paris 2 . 2 M 300 Table: Relational data NoSQL data: { name : Alexis , age : 37 , father : { name : Alexis , place : 2 }} { name : Bob , gender : male , age : 37 , father : { name : Bob , place : 1 }} Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 6 / 29
Problem statement SQL data: Places id name population stations 1 New York City 8 . 2 M 468 2 Paris 2 . 2 M 300 Table: Relational data NoSQL data: { name : Alexis , age : 37 , father : { name : Alexis , place : 2 }} { name : Bob , gender : male , age : 37 , father : { name : Bob , place : 1 }} Query For people with a known gender, 37 years old, and named after their father, give the father’s name and place name. Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 6 / 29
Solution direction Query Result Abstraction SQL NoSQL Figure: Desired database abstraction architecture Solution directions: Separate software layer Virtually load SQL data in NoSQL Virtually load NoSQL data in SQL Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 7 / 29
Solution direction Query Result Abstraction SQL NoSQL Figure: Desired database abstraction architecture Solution directions: Separate software layer Virtually load SQL data in NoSQL Virtually load NoSQL data in SQL Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 7 / 29
Solution direction Advantages of loading NoSQL data in SQL : Mature basis for the hybrid database SQL is a well-known standard SQL database can take care of more advanced tasks Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 8 / 29
Solution direction Advantages of loading NoSQL data in SQL : Mature basis for the hybrid database SQL is a well-known standard SQL database can take care of more advanced tasks General approach: Reconstruct the NoSQL data in SQL Extend SQL to support NoSQL data Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 8 / 29
Related work Logical representation “First-order normal form” (Litwin et al., SIGMOD Rec. 1991) SQL query pattern extensions RDF querying in Oracle (Chong et al., VLDB 2005) Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 9 / 29
Related work Logical representation “First-order normal form” (Litwin et al., SIGMOD Rec. 1991) SQL query pattern extensions RDF querying in Oracle (Chong et al., VLDB 2005) Very early days for querying document-centric stores Atzeni, Bugiotti, Rossi (EDBT 2012, CAiSE 2012) Chasseur, Li, Patel (WebDB 2013) Standardization of document query languages UnQL JSONiq Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 9 / 29
Outline 1 Introduction 2 Theoretical framework 3 Experimental framework 4 Empirical analysis 5 Conclusions Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 10 / 29
NoSQL data representation { name: Alexis, NoSQL F age: 37, id key value father: { i 1 name Alexis name: Alexis, i 1 age 37 place: 2 i 1 father i 2 } i 2 name Alexis }, i 2 place 2 { i 3 name Bob name: Bob, i 3 gender male gender: male, i 3 age 37 age: 37, i 3 father i 4 father: { Bob i 4 name name: Bob, 1 i 4 place place: 1 } Table: Logical NoSQL relation } Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 11 / 29
Query language Query For people with a known gender, 37 years old, and named after their father, give the father’s name and place name. NoSQL query pattern: ( name: ?n, gender: ?g, age: 37, father: ( name: ?n, place: ?p ) ) Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 12 / 29
Query language SELECT 1 f.n, 2 p.name 3 FROM 4 NoSQL( 5 name: ?n, 6 Include in SQL query: gender: ?g, 7 age: 37, 8 Alias pattern and treat as father: ( 9 relation name: ?n, 10 place: ?p 11 Use variable names as ) 12 attributes in SQL ) AS f, 13 places AS p 14 WHERE 15 f.p = p.id 16 Listing 1: Example SQL query with included NoSQL query pattern Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 13 / 29
Query translation General relation construction with renaming: F t = ρ F t ( i t , k t , v t ) ( F ) Add selection conditions to F t based on pattern: NoSQL query pattern Translation Selection ( name: ?n, F name σ key=name gender: ?g, F gender σ key=gender age: 37, F age σ key=age ∧ value=37 father: ( F father σ key=father name: ?n, F father,name σ key=name place: ?p F father,place σ key=place ) ) Table: Query translation example Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 14 / 29
Query translation NoSQL query pattern ( name : ?n , gender : ?g , age : 37 , father : ( name : ?n , place : ?p)) Join relations using id attributes: F name ⋊ ⋉ i name = i gender F gender ⋊ ⋉ i gender = i age F age ⋊ ⋉ i age = i father F father ⋉ v father = i father,name ⋊ F father,name ⋊ ⋉ i father,name = i father,place F father,place Variables in pattern: Replace with corresponding F relation attribute Add equality condition if the same variable is reused Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 15 / 29
Query translation π v name , p . name ( σ v name = v father,name ( SELECT F name 1 f.n, 2 ⋊ ⋉ i name = i gender p.name 3 F gender FROM 4 NoSQL( ⋊ ⋉ i gender = i age 5 name: ?n, F age 6 gender: ?g, 7 ⋊ ⋉ i age = i father age: 37, 8 F father father: ( 9 name: ?n, ⋊ ⋉ v father = i father,name 10 place: ?p F father,name 11 ) 12 ⋊ ⋉ i father,name = i father,place ) AS f, 13 F father,place places AS p 14 ) WHERE 15 f.p = p.id 16 ⋊ ⋉ v father,place = p . id ρ p ( places ) ) Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 16 / 29
Query processing strategies Data filtering Selection pushdown Let F only contain NoSQL data that matches the pattern Selection conditions of F relation copies can be combined Projection pushdown Only variables in NoSQL query pattern can be used in query Exclude unnecessary records in F and relations from the query Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 17 / 29
Query processing strategies Data filtering Selection pushdown Let F only contain NoSQL data that matches the pattern Selection conditions of F relation copies can be combined Projection pushdown Only variables in NoSQL query pattern can be used in query Exclude unnecessary records in F and relations from the query Temporary relation First query to materialize records in temporary relation T Let each F t use T instead of F Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 17 / 29
Query processing strategies Data filtering Selection pushdown Let F only contain NoSQL data that matches the pattern Selection conditions of F relation copies can be combined Projection pushdown Only variables in NoSQL query pattern can be used in query Exclude unnecessary records in F and relations from the query Temporary relation First query to materialize records in temporary relation T Let each F t use T instead of F Tuple reconstruction facilitate nested document reconstruction by extending F with an additional document id field include equality selection on this field for each F used during query processing Roijackers & Fletcher (TU Eindhoven) Bridging SQL and NoSQL BNCOD 2013 17 / 29
Recommend
More recommend