advanced data modelling in postgresql
play

Advanced Data Modelling in PostgreSQL Chris Travers Adjust GmbH - PowerPoint PPT Presentation

Introduction Basics Derivations Extractions Types Inheritance Thank you Advanced Data Modelling in PostgreSQL Chris Travers Adjust GmbH May 16, 2020 Introduction Basics Derivations Extractions Types Inheritance Thank you About Me


  1. Introduction Basics Derivations Extractions Types Inheritance Thank you Advanced Data Modelling in PostgreSQL Chris Travers Adjust GmbH May 16, 2020

  2. Introduction Basics Derivations Extractions Types Inheritance Thank you About Me • New contributor to PostgreSQL (one bugfix so far) • Heads the PostgreSQL-related R&D at Adjust GmbH • Long-time PostgreSQL user (since 1999) • Been around the community for a long time.

  3. Introduction Basics Derivations Extractions Types Inheritance Thank you About Adjust We are big PostgreSQL users. Over 10PB of data, with near-real-time analytics on 1PB of raw data and 400,000 inbound requests per second. We provide mobile advertisement attribution and analytics services to companies who buy advertising.

  4. Introduction Basics Derivations Extractions Types Inheritance Thank you What We Will Cover • Derived Data Modelling • Custom Data Types • Data Extracted from Semi-Structured Data • Inheritance and Polymorphic Relationships • Multiple Inheritance and Overlapping Sets

  5. Introduction Basics Derivations Extractions Types Inheritance Thank you Quick Refresher of RDBMS Model • Relation as a set of corresponding facts • Relation entry modelled as tuple • Functional dependencies important for inference derivation

  6. Introduction Basics Derivations Extractions Types Inheritance Thank you RDBMSs as Inferential Systems • Functional dependencies imply functions • Inferences are derived via function composition

  7. Introduction Basics Derivations Extractions Types Inheritance Thank you ORDBMS Goals Example Query SELECT * FROM photos p JOIN locations l ON distance(p.pt, l.pt) < ’100km’::distance WHERE is sunset(p.graphics) • Distance calculation? • Distance type? • Detecting sunsets?

  8. Introduction Basics Derivations Extractions Types Inheritance Thank you What is needed? • Custom data types to normalize and compare • Custom calculations • Pluggable calculations based on arbitrary data

  9. Introduction Basics Derivations Extractions Types Inheritance Thank you Calculated Inferences • All inferences are calculated. • Some inferences require traversing known facts • Other inferences can be calculated based on a single known fact or a single relation entry. This section is about the last category.

  10. Introduction Basics Derivations Extractions Types Inheritance Thank you Examples of tuple-derived Inferences • tsvectors for searching notes with a subject, author, and contents • Area of a polygon • Bounding box of a polygon Given x, we get one and exactly one y, and we can calculate it knowing only x.

  11. Introduction Basics Derivations Extractions Types Inheritance Thank you Example: Searchable Notes Table What we want create table note ( SELECT * from note id serial primary key, WHERE plainto tsquery( subject text not null, ’Postgres is great’ author text not null, ) @@ tsvector(note) contents text not null, - - or note.tsvector ... );

  12. Introduction Basics Derivations Extractions Types Inheritance Thank you The Function Function CREATE OR REPLACE FUNCTION tesvector(note) RETURNS TSVECTOR LANGUAGE SQL IMMUTABLE AS $$ SELECT to tsvector(array to string( array[$1.subject $1.contents], ’ ’ )); $$;

  13. Introduction Basics Derivations Extractions Types Inheritance Thank you But there is more! • CREATE INDEX ON note using gin (tsvector(note)); • We can change the function and reindex • Calculated on write, not on read

  14. Introduction Basics Derivations Extractions Types Inheritance Thank you Fun with notation Quick: Which of these does not work? • select tsvector(note) from note; • select subject(note) from note; • select note.subject, note.tsvector from note; • select tsvector from note; Some of this is due to Postgres’s heritage before SQL.

  15. Introduction Basics Derivations Extractions Types Inheritance Thank you Exercise Exercise 1 https: //www.github.com/einhverfr/admcourse/exercises/1 (10 min to complete)

  16. Introduction Basics Derivations Extractions Types Inheritance Thank you Particular Example: Extracting Data from Structured Text

  17. Introduction Basics Derivations Extractions Types Inheritance Thank you Desired SQL We want to pull protein documents that are patents and sort based on patent date: Something like this select * from published doc d WHERE d.protein = 123 AND d.is patent ORDER BY d.patent date asc LIMIT 100

  18. Introduction Basics Derivations Extractions Types Inheritance Thank you Example Document Problem is, this information is the published docs text field. Example DB patentd REF 123ZZ12433 REGD 19750305 .... Here the important data is stored in a structured text document along with different documents of different structures.

  19. Introduction Basics Derivations Extractions Types Inheritance Thank you Example is patent function We want to make sure this is only called on these documents so we create a function which takes in a row of the appropriate type: is patent create or replace function is patent(published doc) returns bool strict immutable language sql as $$ select $1.document like ’%DB patentd%’ $$; Simple. Part 1 solved.

  20. Introduction Basics Derivations Extractions Types Inheritance Thank you Part 2: date extraction patent date function create or replace function patent date(published doc) returns date strict immutable language plperl as $$ use strict; my @lines = split / \ n /, $ [1]; my ($regd) = grep /ˆREGD \ d { 4 }\ d { 2 }\ d { 2 } / @lines; return unless $regd; $regd =˜ /ˆREGD \ d { 4 }\ d { 2 }\ d { 2 } / ; return ”$1-$2-$3”; $$; create index doc patent date on published doc(patent date(published doc));

  21. Introduction Basics Derivations Extractions Types Inheritance Thank you Exercise 2 Exercise 2 https: //www.github.com/einhverfr/admcourse/exercises/2 (10 min to complete)

  22. Introduction Basics Derivations Extractions Types Inheritance Thank you Approach here • Composite types are used for illustration purposes • Most types should be implemented in C for heavier use • Indexing is not covered.

  23. Introduction Basics Derivations Extractions Types Inheritance Thank you Use case 1: Interfaces Use Cases: • Fixed-structure complex field in table • N1NF modelling (arrays of tuples in a field) • Way of passing data into or out of a user defined function

  24. Introduction Basics Derivations Extractions Types Inheritance Thank you Example for Contact/tagged text Type definition create type full customer as ( customer customer, contacts contact item[], addresses address[] ); Each field is a tuple or an array of tuples.

  25. Introduction Basics Derivations Extractions Types Inheritance Thank you Example function Save Function create or replace function full customer save(full customer in c) returns void language sql strict as $$ insert into customer select ((in c).customer).*; insert into contact item select u.* from unnest(contacts(in c)) u; insert into address select u.* from unnest(addresses(in c)) u; $$;

  26. Introduction Basics Derivations Extractions Types Inheritance Thank you Exercise 3 Exercise 3 https: //www.github.com/einhverfr/admcourse/exercises/3 (10 min to complete)

  27. Introduction Basics Derivations Extractions Types Inheritance Thank you Use Case 2: Tables as Types • Convert a table into another type • Arithmetic on rows • Aggregates on rows

  28. Introduction Basics Derivations Extractions Types Inheritance Thank you Example: work shift Table Structure CREATE TABLE work shift ( id serial primary key, employee id int references employee(id), shift start timestamp not null, shift end timestamp not null, ... );

  29. Introduction Basics Derivations Extractions Types Inheritance Thank you Example Derivation Function CREATE OR REPLACE FUNCTION date(work shift) RETURNS date language sql immutable as $$ select $1.shift start::date; $$; CREATE OR REPLACE FUNCTION timestamp(work shift) RETURNS interval language sql immutable as $$ select $1.shift end - $1.shift start; $$; CREATE OR REPLACE FUNCTION add intervals(work shift, work shift) RETURNS interval language sql immutable as $$ select interval($1) + interval($2)$$;

  30. Introduction Basics Derivations Extractions Types Inheritance Thank you Example Aggregate CREATE OR REPLACE AGGREGATE total interval(work shift) as ( sfunc = add intervals, stype = interval, initialcond = ’0 days’ );

  31. Introduction Basics Derivations Extractions Types Inheritance Thank you Exercise 4 Exercise 4 https: //www.github.com/einhverfr/admcourse/exercises/4 (10 min to complete)

  32. Introduction Basics Derivations Extractions Types Inheritance Thank you The Polymorphic Relationship Problem

  33. Introduction Basics Derivations Extractions Types Inheritance Thank you Common example: Notes • notes in an accounting system • invoices, customers, vendors, journal entries, etc • Each of these have their own primary key spaces

  34. Introduction Basics Derivations Extractions Types Inheritance Thank you What Inheritance Gives You • a collection for all of the collection entries • referential integrity enforcement • ability to differentiate on polymorphic relationship • Consistent point of management • casts for common functions like tsvector conversions etc

Recommend


More recommend