Virtual Mutation Analysis of Relational Database Schemas University - PowerPoint PPT Presentation

Virtual Mutation Analysis of Relational Database Schemas University of Sheffield   Phil McMinn   Allegheny College   Gregory M. Kapfhammer   University of Sheffield Chris J. Wright

Relational Databases –   Why Should We (Still) Care? A vital component of many software systems Despite the wave of interest in “NoSQL” technologies, Relational Databases are still popular (and faster) For developers: schemas provide self-documentation

Relational Databases –   Why Should We (Still) Care? A vital component of many software systems Despite the wave of interest in “NoSQL” technologies, Relational Databases are still popular (and faster) For developers: schemas provide self-documentation Relational Databases are still   important, popular and relevant

A Relational Database Schema

A Relational Database Schema Table

A Relational Database Schema Table Column and   data type

Integrity Constraints Prevent invalid data being entered into the database Encode domain logic

Testing the Schema Correctly accepted by the schema Correctly rejected by the schema Correctly rejected by the schema

Why Do We Need to Do This?

Why Do We Need to Do This? To trap common errors when designing a schema   For example: lack of uniqueness property on usernames, out of range values

Why Do We Need to Do This? To trap common errors when designing a schema   For example: lack of uniqueness property on usernames, out of range values To test development behaviour vs deployment   DBMSs have subtly different behaviors

Why Do We Need to Do This? To trap common errors when designing a schema   For example: lack of uniqueness property on usernames, out of range values To test development behaviour vs deployment   DBMSs have subtly different behaviors Nobody throws away a database of data   To test the success of database migrations

Why Do We Need to Do This? To trap common errors when designing a schema   For example: lack of uniqueness property on usernames, out of range values To test development behaviour vs deployment   DBMSs have subtly different behaviors Nobody throws away a database of data   To test the success of database migrations Industry advice   Destroying database consistency can have huge cost implications

Mutation Analysis Once a test suite has been created, its fault finding capability can be estimated with mutation analysis. For relational database schema testing, mutants are created by making small changes to the schema

Mutation Analysis is Costly schema mutants

Mutation Analysis is Costly mutant schema mutants + test suite database Mutant killed / alive

Mutation Analysis is Costly mutant schema mutants + test suite database Mutant killed / alive SchemaAnalyst

Mutation Analysis is Costly

Mutation Analysis is Costly DO FEWER

Mutation Analysis is Costly DO FEWER DO SMARTER

Mutation Analysis is Costly DO FEWER DO SMARTER DO FASTER

Mutation Analysis is Costly mutant schema mutants + test suite database Mutant killed / alive SchemaAnalyst

Mutation Analysis is Costly mutant schema mutants + e h t h t i w g n i t a c i t n i u n m o m s e o i c r e f u o q t s L o Q test suite c S h g g n i H i t u c e x database e d n a S M B D Mutant killed / alive SchemaAnalyst

Reducing the Cost mutant schema mutants + test suite database Mutant killed / alive SchemaAnalyst

Reducing the Cost mutant schema mutants + d a e h r e v o n o i t a c test suite i n u m m o c o database n , l a c o L Mutant killed / alive SchemaAnalyst

Reducing the Cost mutant schema mutants + test suite model of database Mutant killed / alive SchemaAnalyst

Reducing the Cost Virtual mutant Mutation   Analysis schema mutants + test suite model of database Mutant killed / alive SchemaAnalyst

Reducing the Cost Virtual mutant Mutation   Analysis schema mutants + d a e h r e v o n test suite model of o i t u c e x e r e database w o L Mutant killed / alive SchemaAnalyst

The Model

The Model Integrity constraint predicate icp1

The Model Integrity constraint predicate icp1 icp2 icp3

The Model Integrity constraint predicate icp1 icp2 icp3 icp4 icp5

The Model Integrity constraint predicate icp1 icp2 icp3 icp4 icp5 Form an acceptance predicate for the table: ap = icp1 /\ icp2 /\ icp3 /\ icp4 /\ icp5

The Model Integrity constraint predicate icp1 icp2 icp3 icp4 icp5 Form an acceptance predicate for the table: ap = icp1 /\ icp2 /\ icp3 /\ icp4 /\ icp5 True when DBMS would accept the data False otherwise

Virtual DBMS Models

Empirical Study RQ1. What is the relative efficiency of the virtual approach? RQ2. What are the time savings? RQ3. How do mutation scores compare when the standard approach is run for as long as the virtual one?

Subject Schemas

Mutation Analysis Time (Log Transformed) 1 1 1 0 0 0 1 0 1 0 1 0 C 0 0 0 0 0 0 o 0 0 0 0 0 0 f f 0 0 0 0 0 0 e e O r d e r s E m p l o y e e I n v e n t RQ1: Efficiency o r y I s o 3 J 1 W 6 6 h o M i s o S Standard z e i l r l a v P e r e r m i s s i o n N s i s t W e a t h e r P e r s o n P r o Database Schema d u c t s C o f f e e O r d e r s E m p l o y e e I n v e n t o r y I s o 3 J 1 W 6 6 h o M i s o S z e i Virtual l r l a v P e r e r m i s s i o n N s i s t W e a t h e r P e r s o n P r o d u c t s SQLite PostgreSQL HyperSQL

Mutation Analysis Time (Log Transformed) Virtual Mutation Analysis is significantly more efficient for 1 1 1 0 0 0 1 0 1 0 1 0 C 0 0 0 0 0 0 o 0 0 0 0 0 0 f f 0 0 0 0 0 0 e e O r d e r s E m p l o y e e I n v Postgres and HyperSQL, but not SQLite e n t RQ1: Efficiency o r y I s o 3 J 1 W 6 6 h o M i s o S Standard z e i l r l a v P e r e r m i s s i o n N s i s t W e a t h e r P e r s o n P r o Database Schema d u c t s C o f f e e O r d e r s E m p l o y e e I n v e n t o r y I s o 3 J 1 W 6 6 h o M i s o S z e i Virtual l r l a v P e r e r m i s s i o n N s i s t W e a t h e r P e r s o n P r o d u c t s SQLite PostgreSQL HyperSQL

RQ2: Time Savings HyperSQL PostgreSQL SQLite 100 Percentage of Mean Time Saved 50 0 − 50 − 100 50 100 150 Number of Mutants

RQ2: Time Savings HyperSQL PostgreSQL SQLite 100 Virtual Mutation Analysis yields large time savings for Percentage of Mean Time Saved Postgres and HyperSQL but not always with SQLite, 50 leading to an average time saving of 51% overall 0 − 50 − 100 50 100 150 Number of Mutants

Total Number of Mutants C o 1 1 f f e 5 0 5 e 0 0 0 0 O r d e E r m s p l o y e I n e v e n t o r y I s J RQ3: Comparison o W 3 M h 1 o o 6 z 6 i i s l HyperSQL l S a P e r e v r e m r i s N s i i o s n t W s e a t h e r P e r s o P n r o d u C c t o s f f e e O r d e E r m s p l o y e I n e v e n t o r y I s J o Database Schema W 3 M h 1 Selective o o 6 z PostgreSQL 6 i i s l l S a P e r e v r e m r i s N s i i o s n t W s e a t h e Virtual r P e r s o P n r o d u C c t o s f f e e O r d e E r m s p l o y e I n e v e n t o r y I s J o W M 3 h 1 o o 6 z 6 i i s l l S a P e SQLite r e v r e m r i s N s i i o s n t W s e a t h e r P e r s o P n r o d u c t s

Virtual Mutation Analysis of Relational Database Schemas University - PowerPoint PPT Presentation

Virtual Mutation Analysis of Relational Database Schemas University of Sheffield Phil McMinn Allegheny College Gregory M. Kapfhammer University of Sheffield Chris J. Wright Relational Databases Why Should We (Still) Care?

Web Services Web Services XML Schemas XML Schemas XML Schemas Whenever DTDs are not enough

DOMINO: Fast and Effective Test Data Generation for Relational Database Schemas Abdullah

Chapter 3: Relational Model Structure of Relational Databases Relational Algebra Tuple

Chapter 2: Relational Model Chapter 2: Relational Model Structure of Relational Databases

What about larger-scale representations? Challenges for traditional theories of schemas How to

1 Mutation pressure Let = the mutation rate from A a Let = the mutation rate from a

1 Mutation pressure Let = the mutation rate from A a Let = the mutation rate from a

Relational Algebra Relational Query Languages Recall: Query = Retrieval Program Language

Relational Algebra 1 / 39 Relational Algebra Relational model specifies stuctures and

Relational Query Languages (2) SQL and QBE Walid G. Aref Query Languages For The Relational

Chapter 8 Evaluation of Relational Operators Implementing the Relational Algebra Relational

This Lecture The Relational Model Relational data structures Relations and Relational

Database Concepts 1 / 16 Database Concepts Data models, schemas, instances Three-schema

Drift, mutation, selection, and the evolutionary dispersion of mean phenotypes recombination

CSE 154 LECTURE 13:RELATIONAL DATABASES AND SQL Relational databases relational database : A

CSC 337 LECTURE 20: RELATIONAL DATABASES AND SQL Relational databases relational database : A

ASSESSMENT 3 BRIEF Subject Code and Title MKT600 Marketing Assessment 3: Presentation on S u s ta

A case for safety : A Regulator's View Presentation to IChemE Safety Group Jane Cutler Chief

CFTC Rescinds Widely Used Private Fund February 16, 2012 Manager Exemption from Commodity Pool

Monitoring Device Andrew Jarrett Project Advisor: Professor Gutschlag Department of Electrical

Triple-S conference: What Now? 21 st September 2011 Help Ive lost my grid! How to

Outline 1 Motivation 2 The DL DL - Lite ( HN ) horn 3 Knowledge Base Satisfiability 4 Query

Project Plan Dealer Improvement Recommender System The Capstone Experience Team Urban Science

JEDEX 2003 Memory Futures Track 2 March 25, 2003 High Speed Digital Systems Require