A Functorial Query Language
Ryan Wisnesky, David Spivak
Department of Mathematics, Massachusetts Institute of Technology
{wisnesky, dspivak}@math.mit.edu
Presented at Boston Haskell, April 16, 2014
Outline
§ Introduction to FQL.
  § FQL is a database query language based on category theory.
  § But there will be no category theory in this talk.
§ How to program FQL using Haskell.
  § FQL provides an alternative semantics for Haskell programs.
  § If you can program Haskell, you can program FQL.
§ Demo of the FQL IDE.
§ Project webpage: categoricaldata.net/fql.html
Introduction to FQL
§ In FQL, a database schema is a special kind of entity-relationship (ER) diagram.

[Schema diagram: entity sets Emp and Dept; foreign keys manager : Emp → Emp, worksIn : Emp → Dept, secretary : Dept → Emp; attributes first and last on Emp, name on Dept.]

Path equalities:
  Emp.manager.worksIn = Emp.worksIn
  Dept.secretary.worksIn = Dept

Emp
ID   mgr  works  first  last
101  103  q10    Al     Akin
102  102  x02    Bob    Bo
103  103  q10    Carl   Cork

Dept
ID   sec  name
q10  102  CS
x02  101  Math
Introduction to FQL

[Schema diagram: entity sets Emp and Dept; foreign keys manager : Emp → Emp, worksIn : Emp → Dept, secretary : Dept → Emp; attributes first and last on Emp, name on Dept.]

Path equalities:
  Emp.manager.worksIn = Emp.worksIn
  Dept.secretary.worksIn = Dept

§ Each black node represents an entity set (of IDs).
§ Each directed edge represents a foreign key.
§ Each open circle represents an attribute.
§ Data integrity constraints are path equalities.
§ Data is stored as tables in the obvious way: one table per entity set, with one column per outgoing foreign key or attribute.
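The reading of a constraint as a path equality can be sketched in Haskell. This is only an illustration, not FQL's implementation: the names `FK`, `follow`, and `pathEq` are made up here, and the data is the mgr and works columns of the Emp table shown earlier.

```haskell
import qualified Data.Map as Map
import Data.Map (Map)

-- A foreign key, stored as a map from source IDs to target IDs.
-- (Hypothetical representation, chosen for this sketch.)
type FK = Map String String

-- The mgr and works columns of the Emp table.
manager, worksIn :: FK
manager = Map.fromList [("101", "103"), ("102", "102"), ("103", "103")]
worksIn = Map.fromList [("101", "q10"), ("102", "x02"), ("103", "q10")]

-- Follow a path (foreign keys applied left to right) from a starting ID.
follow :: [FK] -> String -> Maybe String
follow path start =
  foldl (\cur fk -> cur >>= (`Map.lookup` fk)) (Just start) path

-- A path equality holds when both paths agree on every starting ID.
pathEq :: [String] -> [FK] -> [FK] -> Bool
pathEq ids p q = all (\i -> follow p i == follow q i) ids

-- Emp.manager.worksIn = Emp.worksIn, checked on the three Emp IDs.
empConstraint :: Bool
empConstraint = pathEq (Map.keys manager) [manager, worksIn] [worksIn]
```

In GHCi, `empConstraint` evaluates to `True` for these three rows: every employee works in the same department as their manager.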
Why FQL?
§ FQL is a language for manipulating the schemas and instances just defined.
§ But you can also manipulate such schemas and instances using SQL.
§ We assert that, because of its categorical roots, FQL is a better language for doing so.
§ FQL is “database at a time”, not “table at a time”.
§ FQL operations necessarily respect constraints.
§ Unlike SQL, FQL is expressive enough to be used for information integration (see papers).
§ Parts of FQL can run on SQL, and vice versa.
FQL Basics
§ A schema mapping F : S → T is a constraint-respecting mapping
    nodes(S) → nodes(T)
    edges(S) → paths(T)
  and it induces three data migration operations:
  § Δ_F : T-inst → S-inst (like projection)
  § Σ_F : S-inst → T-inst (like union)
  § Π_F : S-inst → T-inst (like join)
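Δ (Project)

[Diagram: F maps the source schema, with nodes N1 (attributes Name, Salary) and N2 (attribute Age), to the target schema, with a single node N (attributes Name, Age, Salary).]

Δ_F takes the N table to the N1 and N2 tables:

N
ID  Name   Age  Salary
1   Bob    20   $250
2   Sue    20   $300
3   Alice  30   $100

N1
ID  Name   Salary
1   Bob    $250
2   Sue    $300
3   Alice  $100

N2
ID  Age
1   20
2   20
3   30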
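For this particular mapping, Δ_F amounts to copying columns of N into N1 and N2. A minimal Haskell sketch, assuming rows are plain tuples with salaries stored as integers (the names `NRow`, `deltaN1`, and `deltaN2` are invented for the example and are not FQL syntax):

```haskell
-- Rows of the target table N: (ID, Name, Age, Salary).
type NRow = (Int, String, Int, Int)

nTable :: [NRow]
nTable = [(1, "Bob", 20, 250), (2, "Sue", 20, 300), (3, "Alice", 30, 100)]

-- N1 keeps the ID, Name, and Salary columns of N.
deltaN1 :: [NRow] -> [(Int, String, Int)]
deltaN1 rows = [(i, name, salary) | (i, name, _, salary) <- rows]

-- N2 keeps the ID and Age columns of N.
deltaN2 :: [NRow] -> [(Int, Int)]
deltaN2 rows = [(i, age) | (i, _, age, _) <- rows]
```

Here `deltaN1 nTable` and `deltaN2 nTable` reproduce the N1 and N2 tables of the slide, row for row.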
Π (Join)

[Diagram: F maps the source schema, with nodes N1 (attributes Name, Salary) and N2 (attribute Age), to the target schema, with a single node N (attributes Name, Age, Salary).]

Π_F takes the N1 and N2 tables to the N table:

N1
ID  Name   Salary
1   Bob    $250
2   Sue    $300
3   Alice  $100

N2
ID  Age
1   20
2   20
3   30

N
ID  Name   Age  Salary
1   Alice  20   $100
2   Alice  20   $100
3   Alice  30   $100
4   Bob    20   $250
5   Bob    20   $250
6   Bob    30   $250
7   Sue    20   $300
8   Sue    20   $300
9   Sue    30   $300
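With no foreign key connecting N1 and N2, the join above is a Cartesian product: every N1 row is paired with every N2 row. A hedged Haskell sketch of that special case (the tuple layout and the name `piN` are assumptions of this example; the fresh IDs and row order it produces are arbitrary and differ from the slide, which lists rows by name):

```haskell
n1Table :: [(Int, String, Int)]  -- (ID, Name, Salary)
n1Table = [(1, "Bob", 250), (2, "Sue", 300), (3, "Alice", 100)]

n2Table :: [(Int, Int)]          -- (ID, Age)
n2Table = [(1, 20), (2, 20), (3, 30)]

-- Pair every N1 row with every N2 row and number the results 1, 2, ...
-- Result rows are (ID, Name, Age, Salary).
piN :: [(Int, String, Int, Int)]
piN =
  [ (newId, name, age, salary)
  | (newId, ((_, name, salary), (_, age))) <- zip [1 ..] pairs
  ]
  where
    pairs = [(r1, r2) | r1 <- n1Table, r2 <- n2Table]
```

`length piN` is 9, matching the nine rows of N on the slide.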
Σ (Union)

[Diagram: F maps the source schema, with nodes N1 (attributes Name, Salary) and N2 (attribute Age), to the target schema, with a single node N (attributes Name, Age, Salary).]

Σ_F takes the N1 and N2 tables to the N table:

N1
ID  Name   Salary
1   Bob    $250
2   Sue    $300
3   Alice  $100

N2
ID  Age
1   20
2   20
3   30

N
ID  Name   Age   Salary
1   Alice  null  $100
2   Bob    null  $250
3   Sue    null  $300
4   null   20    null
5   null   20    null
6   null   30    null
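The union above can be sketched as a disjoint union of rows in which every column a source table lacks is filled with null. In this Haskell sketch `Nothing` stands in for null; the name `sigmaN` and the tuple layout are assumptions of the example, and the fresh IDs follow input order rather than the slide's ordering by name.

```haskell
-- Target rows: (ID, Name, Age, Salary); Nothing plays the role of null.
type NRow = (Int, Maybe String, Maybe Int, Maybe Int)

-- Take N1 rows (ID, Name, Salary) and N2 rows (ID, Age), pad the missing
-- columns with Nothing, and renumber the combined rows from 1.
sigmaN :: [(Int, String, Int)] -> [(Int, Int)] -> [NRow]
sigmaN n1 n2 = zipWith renumber [1 ..] (fromN1 ++ fromN2)
  where
    fromN1 = [(Just name, Nothing, Just salary) | (_, name, salary) <- n1]
    fromN2 = [(Nothing, Just age, Nothing) | (_, age) <- n2]
    renumber i (name, age, salary) = (i, name, age, salary)
```

Applied to the three N1 rows and three N2 rows of the slide, `sigmaN` returns six rows: three with a name and salary but no age, then three with only an age.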
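Foreign keys

[Diagram: F maps the source schema, with nodes N1 (attributes Name, Salary) and N2 (attribute Age) and a foreign key f : N1 → N2, to the target schema, with a single node N (attributes Name, Age, Salary).]

Π_F and Σ_F take the N1 and N2 tables to the N table; Δ_F goes back:

N1
ID  Name   Salary  f
1   Bob    $250    1
2   Sue    $300    2
3   Alice  $100    3

N2
ID  Age
1   20
2   20
3   30

N
ID  Name   Age  Salary
1   Alice  20   $100
2   Bob    20   $250
3   Sue    30   $300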
FQL Summary
§ FQL provides a “database at a time” query language for certain kinds of relational databases.
§ For the categorically inclined, roughly:
  § Schemas are finitely-presented categories.
  § Schema mappings are functors.
  § Instances are functors to the category of sets.
  § The instances on any schema form a category.
  § (Σ_F, Δ_F) and (Δ_F, Π_F) are pairs of adjoint functors.
Programming FQL Schemas and Mappings using Haskell
§ By Haskell, I mean the simply-typed λ-calculus (STLC):
  § Types t:
      t ::= 0 | 1 | t + t | t × t | t → t
  § Expressions e:
      e ::= v | λv:t.e | e e | () | fst e | snd e | (e, e) | ⊥ | inl e | inr e | (e + e)
  § Equations:
      fst (e, f) = e    snd (e, f) = f    (λv:t.e) f = e[v ↦ f]    ...
§ Theorem: FQL schemas and mappings are a model of the STLC.
  § Given an STLC type t, you get an FQL schema [t].
  § Given an STLC term Γ ⊢ e : t, you get an FQL schema mapping [e] : [Γ] → [t].
Programming FQL Schemas using Haskell
§ The empty type, 0 (in Haskell, data Empty = ), becomes a schema with no nodes.
§ The unit type, 1 (in Haskell, data Unit = TT ), becomes a schema with one node, TT.
Programming FQL Schemas using Haskell
§ Sum types, t + t′ (in Haskell, Either t t' ), are given by addition:
    {a, b, c} + {d, e} = {inl a, inl b, inl c, inr d, inr e}
§ Product types, t × t′ (in Haskell, (t,t') ), are given by multiplication:
    {a, b, c} × {d, e} = {(a, d), (b, d), (c, d), (a, e), (b, e), (c, e)}
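The node counts above (3 + 2 = 5 and 3 × 2 = 6) can be reproduced by enumerating the values of the corresponding Haskell types. A small sketch, where the types `ABC` and `DE` and the helper `universe` are invented for the example to stand in for the two discrete schemas:

```haskell
-- Stand-ins for the schemas with nodes a, b, c and d, e.
data ABC = A | B | C deriving (Show, Eq, Enum, Bounded)
data DE  = D | E     deriving (Show, Eq, Enum, Bounded)

-- All values of a finite enumerated type.
universe :: (Enum a, Bounded a) => [a]
universe = [minBound .. maxBound]

-- Nodes of the sum schema: every left node, then every right node.
sumNodes :: [Either ABC DE]
sumNodes = map Left universe ++ map Right universe

-- Nodes of the product schema: every pair of nodes.
productNodes :: [(ABC, DE)]
productNodes = [(x, y) | x <- universe, y <- universe]
```

`length sumNodes` is 5 and `length productNodes` is 6, and `(C, E)` appears among the product nodes.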
Programming FQL Schemas using Haskell
§ Function types, t → t′, are given by exponentiation. The schema {a, b, c} → {d, e} has 2³ = 8 nodes, one per function:
    (a ↦ d, b ↦ d, c ↦ d)    (a ↦ e, b ↦ d, c ↦ d)
    (a ↦ d, b ↦ e, c ↦ d)    (a ↦ d, b ↦ d, c ↦ e)
    (a ↦ e, b ↦ e, c ↦ d)    (a ↦ d, b ↦ e, c ↦ e)
    (a ↦ e, b ↦ d, c ↦ e)    (a ↦ e, b ↦ e, c ↦ e)
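The eight nodes can be generated mechanically by choosing an output for each input independently. In this Haskell sketch each function is tabulated as a list of (input, output) pairs; the types `ABC` and `DE` and the name `functionNodes` are assumptions made for the example.

```haskell
-- Stand-ins for the schemas with nodes a, b, c and d, e.
data ABC = A | B | C deriving (Show, Eq, Enum, Bounded)
data DE  = D | E     deriving (Show, Eq, Enum, Bounded)

-- A function from ABC to DE, tabulated as one output per input.
type Table = [(ABC, DE)]

-- mapM in the list monad picks one output for each input in every
-- possible way, giving 2 * 2 * 2 = 8 tables.
functionNodes :: [Table]
functionNodes = map (zip inputs) (mapM (const outputs) inputs)
  where
    inputs  = [minBound .. maxBound] :: [ABC]
    outputs = [minBound .. maxBound] :: [DE]
```

`length functionNodes` is 8; the first table is `[(A,D),(B,D),(C,D)]` and the last is `[(A,E),(B,E),(C,E)]`.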
Programming FQL Schemas using Haskell
§ Constant types, corresponding to user-defined types in Haskell, are simply schemas, for example the schema with nodes Emp and Dept and foreign keys manager, worksIn, and secretary.
§ The operations ×, +, → behave correctly with respect to foreign keys.
§ Hence, STLC types translate to FQL schemas.
Programming FQL Mappings using Haskell
§ In Haskell, we have ⊥ :: a. In FQL, we have a mapping ⊥ : 0 → a.
  [Diagram: the empty schema maps into the Emp/Dept schema.]
§ In Haskell, we have () :: 1. In FQL, we have a mapping () : a → 1.
  [Diagram: the Emp/Dept schema maps onto the one-node schema TT.]
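In Haskell terms, these two mappings correspond to the functions out of the empty type and into the unit type. A short sketch using `Data.Void` from `base`; the names `bottom` and `unit` are chosen here for illustration, and `bottom` is the function form `0 → a` of the slide's ⊥ rather than the polymorphic value `undefined`.

```haskell
import Data.Void (Void, absurd)

-- The mapping 0 -> a: Void has no values, so there is nothing to map.
bottom :: Void -> a
bottom = absurd

-- The mapping a -> 1: every value is sent to the single value ().
unit :: a -> ()
unit = const ()
```

For any argument, `unit` returns `()`, just as every node of the Emp/Dept schema is sent to TT; `bottom` can never be called, since no value of type `Void` exists.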
Programming FQL Mappings using Haskell
§ In Haskell, we have inl :: a → a + b and inr :: b → a + b.
  [Diagram: inl and inr map the schemas {a, b, c} and {d, e} into the sum schema {inl a, inl b, inl c, inr d, inr e}.]
§ In Haskell, we have fst :: a × b → a and snd :: a × b → b.
  [Diagram: fst and snd map the product schema {(a, d), (b, d), (c, d), (a, e), (b, e), (c, e)} back to {a, b, c} and {d, e}.]
Programming FQL Mappings using Haskell
§ We can translate the other STLC operations too:
  § If f :: t → a and g :: t → b, we need (f, g) :: t → a × b.
    § This is pairing.
  § If f :: a → t and g :: b → t, we need (f + g) :: a + b → t.
    § This is case.
  § If f :: a × b → c, we need Λf :: a → (b → c).
    § This is usually called curry.
  § We need ev :: (a → b) × a → b.
    § This is function application.
§ All FQL operations obey the required equations:
    fst (a, b) = a    snd (a, b) = b    ...
§ And the FQL operations work correctly with foreign keys.
§ Hence, FQL mappings are a model of the STLC.
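On the Haskell side, each of these operations is a one-line function. The sketch below writes them out with explicit types; the names `pair`, `caseOf`, and `lambda` are chosen for this example, and `ev` is given the usual type of function application, (a → b) × a → b. `either` and `curry` come from the Prelude.

```haskell
-- Pairing: from f : t -> a and g : t -> b, build (f, g) : t -> a × b.
pair :: (t -> a) -> (t -> b) -> t -> (a, b)
pair f g x = (f x, g x)

-- Case: from f : a -> t and g : b -> t, build (f + g) : a + b -> t.
caseOf :: (a -> t) -> (b -> t) -> Either a b -> t
caseOf = either

-- Currying: from f : a × b -> c, build Λf : a -> (b -> c).
lambda :: ((a, b) -> c) -> a -> b -> c
lambda = curry

-- Evaluation: apply the function in the first component to the second.
ev :: (a -> b, a) -> b
ev (f, x) = f x
```

The required equations can be spot-checked on examples: `fst (pair f g x)` equals `f x`, `snd (pair f g x)` equals `g x`, and `ev (lambda f a, b)` equals `f (a, b)`.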
Retrospective
§ STLC types and terms, FQL schemas and mappings, and even sets and functions between them, all form bi-cartesian closed categories.
§ Haskell programmers will eventually encounter category theory, starting with bi-cartesian closed categories.
§ That theory can be put to use in other places, namely databases.
§ In fact, as we will see next, for every FQL schema S, the category of S-instances is also bi-cartesian closed.
Programming FQL Instances and Morphisms using Haskell
§ By Haskell, I mean the simply-typed λ-calculus (STLC):
  § Types t:
      t ::= 0 | 1 | t + t | t × t | t → t
  § Expressions e:
      e ::= v | λv:t.e | e e | () | fst e | snd e | (e, e) | ⊥ | inl e | inr e | (e + e)
  § Equations:
      fst (e, f) = e    snd (e, f) = f    (λv:t.e) f = e[v ↦ f]    ...
§ Theorem: For each schema S, the FQL S-instances and S-homomorphisms are a model of the STLC.
  § A database homomorphism is a map of IDs to IDs.
  § Given an STLC type t, you get an FQL S-instance [t].
  § Given an STLC term Γ ⊢ e : t, you get an FQL S-homomorphism [e] : [Γ] → [t].
Programming FQL Instances using Haskell
§ Let S be the schema with nodes a and b and one foreign key f : a → b.
§ The empty type, 0 (in Haskell, data Empty = ), becomes an S-instance with no data:

    a            b
    ID  f        ID
    (no rows)    (no rows)

§ The unit type, 1 (in Haskell, data Unit = TT ), becomes an S-instance with one ID per table:

    a            b
    ID  f        ID
    1   1        1
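These two instances are small enough to write down directly. A Haskell sketch, assuming an S-instance is stored as one list of rows per table (the record `Inst` and the check `wellFormed` are invented for this illustration and are not part of FQL):

```haskell
-- An instance on the schema with nodes a, b and one edge f : a -> b.
data Inst = Inst
  { tableA :: [(Int, Int)]  -- rows (ID, f), where f is an ID of table b
  , tableB :: [Int]         -- rows (ID)
  } deriving (Show, Eq)

-- The instance for the empty type 0: no rows in either table.
emptyInst :: Inst
emptyInst = Inst { tableA = [], tableB = [] }

-- The instance for the unit type 1: one ID per table.
unitInst :: Inst
unitInst = Inst { tableA = [(1, 1)], tableB = [1] }

-- Every f value must name a row of table b.
wellFormed :: Inst -> Bool
wellFormed inst = all (\(_, fVal) -> fVal `elem` tableB inst) (tableA inst)
```

Both `emptyInst` and `unitInst` pass `wellFormed`, while an instance whose f column points at a missing b row, such as `Inst [(1, 2)] [1]`, does not.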