JSONB AUDITS: PRO & CONTRA
Felix Kunde and Petra Sauer
WE HAVE A DREAM: THE CONGESTION-FREE CITY
Wirtschaftsatlas Berlin

In the GIS scene, more and more users are interested in data curation. Data creation has been too cost-intensive to just throw data away. Moreover, a view into the past is desirable to measure the effects of location-based decision making.
MOTIVATION

- Track all data changes in the database
- Revisit previous data versions
- Undo changes of certain write operations
- Work against multiple branches of a database
HOW TO AUDIT? IN A RELATIONAL WAY

Audited table, before a row-level trigger fires on SET type = 'moto' for row 2:

 id | type  | geom
----+-------+-----------
  1 | car   | POINT(1 1)
  2 | bike  | POINT(1 2)
  3 | train | POINT(2 2)

History table (shadow table), written by the row-level trigger:

 id | type  | geom       | from | until
----+-------+------------+------+------
  1 | car   | POINT(1 1) | ts1  |
  2 | bike  | POINT(1 2) | ts1  | ts2
  3 | train | POINT(2 2) | ts1  |
  2 | moto  | POINT(1 2) | ts2  |
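A minimal sketch of this pattern, assuming a simple table. All names here are illustrative, not from the talk, and the slide's from/until columns are renamed valid_from/valid_until because FROM is a reserved word:

-- Example table to be audited (geom simplified to text here;
-- with PostGIS this would be a geometry column).
CREATE TABLE my_table (
  id   integer PRIMARY KEY,
  type text,
  geom text
);

CREATE TABLE my_table_history (
  id          integer,
  type        text,
  geom        text,
  valid_from  timestamptz NOT NULL DEFAULT now(),
  valid_until timestamptz   -- NULL marks the current version
);

CREATE OR REPLACE FUNCTION log_my_table() RETURNS trigger AS $$
BEGIN
  -- Close the validity interval of the previous version of this row.
  UPDATE my_table_history
     SET valid_until = now()
   WHERE id = NEW.id
     AND valid_until IS NULL;

  -- Insert the new version as the current one.
  INSERT INTO my_table_history (id, type, geom)
  VALUES (NEW.id, NEW.type, NEW.geom);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER my_table_audit
AFTER INSERT OR UPDATE ON my_table
FOR EACH ROW EXECUTE PROCEDURE log_my_table();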
GOOD ..., BUT ...

Good:
- Easy to set up
- Easy to query the past
- Does not break apps

But:
- What to do with DDL changes?
- Rely on timestamp fields?
- Store complete tuples?
TOOLS

 Tool            | Method          | Log type          | Revision
-----------------+-----------------+-------------------+-------------
 timetravel      | Audit Trail     | Extra Columns     | Timestamps
 temporal_tables | Audit Trail     | Shadow Tables     | Timestamps
 table_version   | Audit Trail     | Shadow Tables     | UD revision
 table_log       | Audit Trail     | Shadow Tables     | Trigger seq
 audit_trigger   | Audit Trail     | Generic (hstore)  | Transactions
 pgMemento       | Audit Trail     | Generic (jsonb)   | Transactions
 CyanAudit       | Audit Trail     | Generic (pivot)   | Transactions
 pgVersion       | Version Control | Extra Columns     | UD revision
 QGIS Versioning | Version Control | Extra Columns     | UD revision
 GeoGig          | Version Control | External (binary) | UD revision
 Flyway          | Migration       | External (SQL)    | UD revision
 Liquibase       | Migration       | External (XML)    | UD revision

See the FOSS4G 2017 talk "How to version my spatial database?": http://slides.com/fxku/foss4g17_dbversion
WHY AUDIT IN JSONB?

- Stop caring about DDL changes
- Can be indexed, so queries are fast
- Take your JSON logs and store them somewhere else
- Everybody loves JSONB!
WHAT IS DIFFERENT?

- Relies on transaction IDs, not timestamps
- Stores only deltas in the logs
- Has a powerful undo feature
LOG TABLES

pgMemento splits its logs into three tables:
- Transaction metadata
- Event metadata
- Data log

When using JSONB, everything could be stored in one log table. pgMemento stores transaction and table event metadata in separate tables to facilitate the lookup of historic actions. The trade-off: less redundancy vs. higher logging overhead.
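A rough sketch of what such a three-table layout could look like. It is simplified, and the column set is inferred from the diagrams on the following slides, not taken from pgMemento's actual DDL:

-- Transaction metadata: one row per audited transaction.
CREATE TABLE transaction_log (
  id        serial PRIMARY KEY,
  txid      bigint NOT NULL,
  stmt_date timestamptz NOT NULL DEFAULT now(),
  user_name text DEFAULT current_user
);

-- Event metadata: one row per statement (table + operation) in a transaction.
CREATE TABLE table_event_log (
  id              serial PRIMARY KEY,
  transaction_id  bigint NOT NULL,   -- the txid, as in the diagrams below
  op_id           smallint NOT NULL, -- numeric operation code, e.g. 4 = UPDATE
  table_operation text,
  table_relid     oid NOT NULL       -- OID of the audited table
);

-- Data log: one row per changed tuple, old values only (the delta).
CREATE TABLE row_log (
  id       bigserial PRIMARY KEY,
  event_id integer REFERENCES table_event_log (id),
  audit_id bigint NOT NULL,          -- surrogate key of the audited row
  changes  jsonb
);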
DML AUDITING

Audited table with a surrogate key column audit_id (used instead of the PK, as a PK can cover more than one column). An UPDATE with SET type = 'moto' runs in txid = 2800000:

 id | type  | geom       | audit_id
----+-------+------------+---------
  1 | car   | POINT(1 1) |        2
  2 | bike  | POINT(1 2) |       23
  3 | train | POINT(2 2) |       42

A BEFORE statement-level trigger inserts the transaction and event metadata:

TRANSACTION_LOG:
 id | txid    | ...
----+---------+-----
 10 | 2800000 | ...

TABLE_EVENT_LOG:
 id | transaction_id | op_id | table_operation | table_relid
----+----------------+-------+-----------------+------------
 50 | 2800000        | 4     | UPDATE          | 2005030
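A condensed sketch of such a statement-level trigger, reusing the hypothetical log tables above. pgMemento's real function is more general; here op_id 4 = UPDATE is hardcoded since the trigger only fires on UPDATE:

CREATE OR REPLACE FUNCTION log_statement() RETURNS trigger AS $$
BEGIN
  -- Log the transaction once per txid.
  INSERT INTO transaction_log (txid)
  SELECT txid_current()
  WHERE NOT EXISTS (
    SELECT 1 FROM transaction_log WHERE txid = txid_current()
  );

  -- Log one event per statement against this table.
  INSERT INTO table_event_log (transaction_id, op_id, table_operation, table_relid)
  VALUES (txid_current(), 4, TG_OP, TG_RELID);  -- 4 = UPDATE in the slides' numbering
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_update_statement
BEFORE UPDATE ON my_table
FOR EACH STATEMENT EXECUTE PROCEDURE log_statement();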
DML AUDITING

After the update, an AFTER row-level trigger logs only the old values of the changed fields, using the surrogate key as tracer:

 id | type  | geom       | audit_id
----+-------+------------+---------
  1 | car   | POINT(1 1) |        2
  2 | moto  | POINT(1 2) |       23
  3 | train | POINT(2 2) |       42

It looks up the event written by the statement-level trigger:

SELECT id
FROM table_event_log
WHERE transaction_id = txid_current()
  AND table_relid = 2005030
  AND op_id = 4;

and inserts the delta:

ROW_LOG:
 id  | event_id | audit_id | changes
-----+----------+----------+----------------
 ... | ...      | ...      | ...
 100 | 50       | 23       | {"type":"bike"}
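A sketch of the matching row-level AFTER trigger. It finds the event logged by the statement-level trigger and keeps only the old values of fields that actually changed; the key-by-key delta computation is my own simplification, not pgMemento's code. It assumes my_table carries the audit_id column from the diagram, and it needs PostgreSQL 9.5 for to_jsonb and jsonb_object_agg:

CREATE OR REPLACE FUNCTION log_update() RETURNS trigger AS $$
DECLARE
  eid   integer;
  delta jsonb;
BEGIN
  -- Find the event the statement-level trigger just logged.
  SELECT e.id INTO eid
    FROM table_event_log e
   WHERE e.transaction_id = txid_current()
     AND e.table_relid = TG_RELID
     AND e.op_id = 4;

  -- Keep only keys whose value differs; the OLD values are what gets logged.
  SELECT COALESCE(jsonb_object_agg(o.key, o.value), '{}'::jsonb)
    INTO delta
    FROM jsonb_each(to_jsonb(OLD)) o
    JOIN jsonb_each(to_jsonb(NEW)) n USING (key)
   WHERE o.value IS DISTINCT FROM n.value;

  INSERT INTO row_log (event_id, audit_id, changes)
  VALUES (eid, OLD.audit_id, delta);
  RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER log_update_row
AFTER UPDATE ON my_table
FOR EACH ROW EXECUTE PROCEDURE log_update();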
DDL AUDITING

The column type is dropped in txid = 2900000. An event trigger at ddl_command_start logs the old values of the dropped column for all rows:

 id | type  | geom       | audit_id
----+-------+------------+---------
  1 | car   | POINT(1 1) |        2
  2 | moto  | POINT(1 2) |       23
  3 | train | POINT(2 2) |       42

ROW_LOG:
 id  | event_id | audit_id | changes
-----+----------+----------+-----------------
 200 | 75       |        2 | {"type":"car"}
 201 | 75       |       23 | {"type":"moto"}
 202 | 75       |       42 | {"type":"train"}
DDL AUDITING

After the drop, the table looks like this:

 id | geom       | audit_id
----+------------+---------
  1 | POINT(1 1) |        2
  2 | POINT(1 2) |       23
  3 | POINT(2 2) |       42

A second event trigger at ddl_command_end inspects the command:

SELECT * FROM pg_event_trigger_ddl_commands();

and closes the txid range of the dropped column in AUDIT_COLUMN_LOG (which references AUDIT_TABLE_LOG):

 id  | audit_table_id | column_name | data_type | txid_range
-----+----------------+-------------+-----------+--------------------
 ... | ...            | ...         | ...       | ...
 100 | 1              | type        | text      | [2700000,2900000)
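A minimal sketch of such an event trigger; the actual bookkeeping against audit_column_log is only hinted at in a comment:

CREATE OR REPLACE FUNCTION log_ddl_end() RETURNS event_trigger AS $$
DECLARE
  rec record;
BEGIN
  -- pg_event_trigger_ddl_commands() lists the DDL commands of this event.
  FOR rec IN SELECT * FROM pg_event_trigger_ddl_commands()
  LOOP
    RAISE NOTICE 'DDL on %: %', rec.object_identity, rec.command_tag;
    -- Here pgMemento compares the new catalog state against
    -- audit_column_log and closes the txid ranges of dropped columns.
  END LOOP;
END;
$$ LANGUAGE plpgsql;

CREATE EVENT TRIGGER audit_ddl_end
ON ddl_command_end EXECUTE PROCEDURE log_ddl_end();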
FAST QUERIES POWERED BY GIN INDEX

For which transactions does column 'type' exist in the logs?

SELECT DISTINCT e.transaction_id
FROM pgmemento.table_event_log e
JOIN pgmemento.row_log r
  ON r.event_id = e.id
WHERE r.audit_id = 23
  AND (r.changes ? 'type');

Which tuples once contained certain combinations of key(s) and value(s)?

SELECT DISTINCT audit_id
FROM pgmemento.row_log
WHERE changes @> '{"type": "bike"}'::jsonb;

Imagine the whole tuple were stored every time: more overhead on data processing.
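The index behind such queries could be created like this. The default operator class supports both the ? existence and the @> containment operator, while jsonb_path_ops yields a smaller index that serves @> only; the index name is illustrative:

-- Default GIN opclass: supports @>, ?, ?| and ?&.
CREATE INDEX row_log_changes_idx
  ON pgmemento.row_log USING gin (changes);

-- Alternative: smaller and faster, but for @> only.
-- CREATE INDEX row_log_changes_idx
--   ON pgmemento.row_log USING gin (changes jsonb_path_ops);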
TIME FOR PAIN
RESTORE PREVIOUS VERSIONS OF TUPLES
THE LIFE OF A TUPLE

Columns: id, name, type, status; audit_id = 10. Over time / transactions, the logged deltas (old values) are:

 Event      | Logged changes
------------+-------------------------------------------------------------
 BIRTH      | NULL
 GROWING UP | {"status": "hello world"}
 MARRIAGE   | {"name": "foo", "status": "alone"}
 WISDOM     | {"type": "phd"}
 DEATH      | {"id": 1, "name": "bar", "type": "prof", "status": "happy"}
HOW WAS LIFE BEFORE MARRIAGE?

 Event      | Logged changes
------------+-------------------------------------------------------------
 BIRTH      | NULL
 GROWING UP | {"status": "hello world"}
            |    <- uh oh, where is the log?!
 MARRIAGE   | {"name": "foo", "status": "alone"}
 WISDOM     | {"type": "phd"}
 DEATH      | {"id": 1, "name": "bar", "type": "prof", "status": "happy"}
STRATEGY 1: ROLLING BACK

- Concatenate all JSONB logs in reverse order
- Start from the recent state (or the delete event) and go back until the requested point in time
- JSONB trick: duplicate keys get overwritten
- Too much overhead if the history is long
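A toy example of the trick, with values loosely based on the life-of-a-tuple slide: the current tuple is concatenated with the logged deltas from newest to oldest, and each later operand of || overwrites duplicate keys, so the older values win:

SELECT '{"id": 1, "type": "moto", "status": "happy"}'::jsonb
    || '{"type": "bike"}'::jsonb       -- old values from the newest event
    || '{"status": "alone"}'::jsonb;   -- old values from an earlier event
-- returns {"id": 1, "type": "bike", "status": "alone"}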
STRATEGY 2: JSONB QUERIES

- Check audit_column_log
- For each column, find the first entry in the logs after the requested point in time
- Feed the result to jsonb_build_object
- Produces giant queries, but still quite fast
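A sketch of the query shape this produces, for a single tuple and two columns (names from the life-of-a-tuple example; txid 2800000 and the treatment of transaction_id as a txid follow the earlier diagrams, and the whole query is an assumption about the generated shape, not pgMemento's actual output). For each column, the first old value logged at or after the requested point is taken; if none exists, the column never changed afterwards and the live value is used:

SELECT jsonb_build_object(
  'name', COALESCE(
    (SELECT r.changes -> 'name'
       FROM row_log r
       JOIN table_event_log e ON e.id = r.event_id
      WHERE r.audit_id = 10
        AND e.transaction_id >= 2800000
        AND r.changes ? 'name'
      ORDER BY r.id
      LIMIT 1),
    (SELECT to_jsonb(t.name) FROM my_table t WHERE t.audit_id = 10)
  ),
  'type', COALESCE(
    (SELECT r.changes -> 'type'
       FROM row_log r
       JOIN table_event_log e ON e.id = r.event_id
      WHERE r.audit_id = 10
        AND e.transaction_id >= 2800000
        AND r.changes ? 'type'
      ORDER BY r.id
      LIMIT 1),
    (SELECT to_jsonb(t.type) FROM my_table t WHERE t.audit_id = 10)
  )
);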
RESTORE – PART 3

SELECT p.*
FROM generate_log_entries(1, 2800000, 'my_table') entries,
LATERAL (
  SELECT *
  FROM jsonb_populate_record(
    null::my_table,
    entries
  )
) p;

This works only with a template. It could be the actual table, but to be correct in the case of any DDL changes, a temporary template can be created on the fly with information from audit_column_log.
WELL ...

... why would you use JSONB anyway, then?

- Relational auditing is faster and easier
- JSONB takes up more space on your disk
- Your app might not like it
REVERT

- Query all changes and referenced events for a given txid (or range of txids) in reverse order
- Loop over the result set and perform the opposite event
- Consider dependencies between tables in order to avoid foreign key violations
REVERT

 op_id | Event                    | Reverse event            | Log content
-------+--------------------------+--------------------------+------------------------------------------
 1     | CREATE TABLE             | DROP TABLE               | -
 2     | ALTER TABLE ADD COLUMN   | ALTER TABLE DROP COLUMN  | -
 3     | INSERT                   | DELETE                   | NULL
 4     | UPDATE                   | UPDATE                   | Changed fields of changed rows
 5     | ALTER TABLE ALTER COLUMN | ALTER TABLE ALTER COLUMN | All rows of altered columns
 6     | ALTER TABLE DROP COLUMN  | ALTER TABLE ADD COLUMN   | All rows of deleted columns
 7     | DELETE                   | INSERT                   | All fields of deleted rows
 8     | TRUNCATE                 | INSERT                   | All fields of the table
 9     | DROP TABLE               | CREATE TABLE             | All fields of the table (logged as TRUNCATE)
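For example, reverting a DELETE (op_id 7) could re-insert the logged tuples with jsonb_populate_record. This is a sketch of the idea, not pgMemento's generated code: event id 60 is hypothetical, and the row_log entries of a DELETE are assumed to hold all fields of the deleted rows, as in the table above:

-- Re-insert the rows deleted by event 60 (hypothetical id).
INSERT INTO my_table
SELECT p.*
  FROM row_log r,
       LATERAL jsonb_populate_record(null::my_table, r.changes) p
 WHERE r.event_id = 60;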
JSONB > QUERIES

- Updates are pretty easy to set up (OK, we could also produce the deltas here instead of during the trigger phase)
- Could lead to a branching concept ...
- What about long-running transactions? (typical GIS workflows: many edits, commit once)
TO-DOS

- Branching concept
- Log tables for more DB objects
- Extending the test suite
- Maybe: logical decoding instead of triggers
PERFORMANCE

Kunde, F., Sauer, P. (2017): pgMemento – A Generic Transaction-Based Audit Trail for Spatial Databases. In: Gertz, M. et al. (eds.): Advances in Spatial and Temporal Databases. SSTD 2017. Lecture Notes in Computer Science, vol. 10411. Springer, Cham.
TECHNICAL DETAILS

- Written entirely in PL/pgSQL
- Requires at least PostgreSQL 9.5
- Repo: github.com/pgmemento
- LGPL v3 licence
QUESTIONS?

Felix Kunde: fkunde[at]beuth-hochschule.de
Petra Sauer: Sauer[at]beuth-hochschule.de

Funded by: