range consistent answers of aggregate queries under
play

Range-Consistent Answers of Aggregate Queries under Aggregate - PowerPoint PPT Presentation

Introduction Preliminaries Query Answering Conclusion and Future Work Range-Consistent Answers of Aggregate Queries under Aggregate Constraints Sergio Flesca, Filippo Furfaro, Francesco Parisi DEIS University of Calabria 87036 Rende (CS),


  1. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Outline Introduction 1 Motivation Contribution Preliminaries 2 Aggregate Constraints Repairs Aggregate Queries Query Answering 3 Steady Aggregate Constraints Computing Range-Consistent Answers Experimental Results Conclusion and Future Work 4 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 6 / 34

  2. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Managing data consistency Often classical “classical” integrity constraints (keys, foreign keys, FDs) do not suffice to manage data consistency in scientific and statistical databases, data warehouses, numerical values in some tuples result from aggregating values in other tuples in the balance sheet example, the sum of cash sales and receivables should be equal to the total cash receipts digitized document Receipts cash sales 100 (e.g. obtained by an OCR tool) receivables 120 total receipts 250 Aggregate constraints allow us to define algebraic relations among aggregate values extracted from the database Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 7 / 34

  3. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Managing data consistency Often classical “classical” integrity constraints (keys, foreign keys, FDs) do not suffice to manage data consistency in scientific and statistical databases, data warehouses, numerical values in some tuples result from aggregating values in other tuples in the balance sheet example, the sum of cash sales and receivables should be equal to the total cash receipts digitized document Receipts cash sales 100 (e.g. obtained by an OCR tool) receivables 120 total receipts 250 Aggregate constraints allow us to define algebraic relations among aggregate values extracted from the database Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 7 / 34

  4. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Managing data consistency Often classical “classical” integrity constraints (keys, foreign keys, FDs) do not suffice to manage data consistency in scientific and statistical databases, data warehouses, numerical values in some tuples result from aggregating values in other tuples in the balance sheet example, the sum of cash sales and receivables should be equal to the total cash receipts digitized document Receipts cash sales 100 (e.g. obtained by an OCR tool) receivables 120 total receipts 250 Aggregate constraints allow us to define algebraic relations among aggregate values extracted from the database Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 7 / 34

  5. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Aggregate Constraints Definition (Aggregate Constraint) An aggregate constraint on a database scheme D is of the form ⇒ � n ∀ � � φ ( � i = 1 c i · χ i ( � � x x ) = y i ) ≤ K c 1 , . . . , c n , K are rational constants; 1 φ ( � x ) is a conjunction of atoms constructed from relation names, 2 constants, and all the variables in � x ; each χ i ( � y i ) is an aggregation function, where � y i is a list of variables 3 and constants, and every variable that occurs in � y i also occurs in � x . The aggregation function χ ( � y ) = � R , e , α ( � y ) � corresponds to the SQL query SELECT SUM (e) FROM R WHERE α ( � y ) , where e is an attribute of R or a constant Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 8 / 34

  6. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Aggregate Constraints Definition (Aggregate Constraint) An aggregate constraint on a database scheme D is of the form ⇒ � n ∀ � � φ ( � i = 1 c i · χ i ( � � x x ) = y i ) ≤ K c 1 , . . . , c n , K are rational constants; 1 φ ( � x ) is a conjunction of atoms constructed from relation names, 2 constants, and all the variables in � x ; each χ i ( � y i ) is an aggregation function, where � y i is a list of variables 3 and constants, and every variable that occurs in � y i also occurs in � x . The aggregation function χ ( � y ) = � R , e , α ( � y ) � corresponds to the SQL query SELECT SUM (e) FROM R WHERE α ( � y ) , where e is an attribute of R or a constant Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 8 / 34

  7. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Example of Aggregate Constraint BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 1 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , x 2 , ‘det’ ) = χ 1 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 9 / 34

  8. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Example of Aggregate Constraint BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 1 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , x 2 , ‘det’ ) = χ 1 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 9 / 34

  9. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Example of Aggregate Constraint BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 1 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , x 2 , ‘det’ ) = χ 1 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 9 / 34

  10. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Repairing strategy (1/2) A repair for a database w.r.t. a set of aggregate constraints is a set of value updates making the database consistent Updates regard attributes representing measure values, such as weights, lengths, prices, etc. We call these attributes measure attributes We assume that the absolute values of measure attributes are bounded by a constant M . It is often possible to pre-determine a specific range for numerical attributes. In the balance sheet context, it can be reasonably assumed that the items are bounded by $ 10 9 . Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 10 / 34

  11. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Repairing strategy (1/2) A repair for a database w.r.t. a set of aggregate constraints is a set of value updates making the database consistent Updates regard attributes representing measure values, such as weights, lengths, prices, etc. We call these attributes measure attributes We assume that the absolute values of measure attributes are bounded by a constant M . It is often possible to pre-determine a specific range for numerical attributes. In the balance sheet context, it can be reasonably assumed that the items are bounded by $ 10 9 . Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 10 / 34

  12. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Repairing strategy (2/2) Reasonable repairs, called card -minimal repairs, are those having minimum cardinality Repairing by card -minimal repairs means assuming that the minimum number of errors occurred In the balance-sheet context: the most probable case is that the acquiring system made the minimum number of errors Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 11 / 34

  13. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Repairing strategy (2/2) Reasonable repairs, called card -minimal repairs, are those having minimum cardinality Repairing by card -minimal repairs means assuming that the minimum number of errors occurred In the balance-sheet context: the most probable case is that the acquiring system made the minimum number of errors Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 11 / 34

  14. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 12 / 34

  15. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 12 / 34

  16. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 12 / 34

  17. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 12 / 34

  18. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Aggregate Queries Definition (Aggregate Query) An aggregate query on a database scheme D is an expression of the form SELECT f FROM R WHERE α , where: R is a relation scheme in D ; 1 f is one of MIN ( A ) , MAX ( A ) or SUM ( A ) , where A in an attribute of R ; 2 α is boolean combination of atomic comparisons of the form X ⋄ Y , 3 where X and Y are constants or non-measure attributes of R , and ⋄ ∈ { = , � = , ≤ , ≥ , <, > } . Our transformation for computing CQAs by solving ILP instances exploits the restriction that no measure attribute occurs in the WHERE clause of an aggregate query Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 13 / 34

  19. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Aggregate Queries Definition (Aggregate Query) An aggregate query on a database scheme D is an expression of the form SELECT f FROM R WHERE α , where: R is a relation scheme in D ; 1 f is one of MIN ( A ) , MAX ( A ) or SUM ( A ) , where A in an attribute of R ; 2 α is boolean combination of atomic comparisons of the form X ⋄ Y , 3 where X and Y are constants or non-measure attributes of R , and ⋄ ∈ { = , � = , ≤ , ≥ , <, > } . Our transformation for computing CQAs by solving ILP instances exploits the restriction that no measure attribute occurs in the WHERE clause of an aggregate query Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 13 / 34

  20. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Range Consistent Answers Let D be a database scheme, AC a set of aggregate constraints on D , q an aggregate query on D , and D an instance of D . Definition (Range-consistent query answer) The range-consistent query answer of q on D is the empty interval ∅ , in the case that D admits no repair w.r.t. AC , or the interval [ glb , lub ] , otherwise, where: i) for each card -minimal repair ρ for D w.r.t. AC , it holds that glb ≤ q ( ρ ( D )) ≤ lub ; ii) there is a pair ρ ′ , ρ ′′ of card -minimal repairs for D w.r.t. AC such that q ( ρ ′ ( D )) = glb and q ( ρ ′′ ( D )) = lub . Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 14 / 34

  21. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Range Consistent Answers Let D be a database scheme, AC a set of aggregate constraints on D , q an aggregate query on D , and D an instance of D . Definition (Range-consistent query answer) The range-consistent query answer of q on D is the empty interval ∅ , in the case that D admits no repair w.r.t. AC , or the interval [ glb , lub ] , otherwise, where: i) for each card -minimal repair ρ for D w.r.t. AC , it holds that glb ≤ q ( ρ ( D )) ≤ lub ; ii) there is a pair ρ ′ , ρ ′′ of card -minimal repairs for D w.r.t. AC such that q ( ρ ′ ( D )) = glb and q ( ρ ′′ ( D )) = lub . Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 14 / 34

  22. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Range Consistent Answers - Example BalanceSheets Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 The range-CQA of SELECT MAX ( Value ) FROM BalanceSheets WHERE Subsection = ‘cash sales’ is [ 100 , 130 ] The range-CQA of SELECT MAX ( Value ) FROM BalanceSheets WHERE Subsection = ‘net cash inflow’ is [ 30 , 30 ] Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 15 / 34

  23. Introduction Aggregate Constraints Preliminaries Repairs Query Answering Aggregate Queries Conclusion and Future Work Range Consistent Answers - Example BalanceSheets Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 The range-CQA of SELECT MAX ( Value ) FROM BalanceSheets WHERE Subsection = ‘cash sales’ is [ 100 , 130 ] The range-CQA of SELECT MAX ( Value ) FROM BalanceSheets WHERE Subsection = ‘net cash inflow’ is [ 30 , 30 ] Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 15 / 34

  24. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Outline Introduction 1 Motivation Contribution Preliminaries 2 Aggregate Constraints Repairs Aggregate Queries Query Answering 3 Steady Aggregate Constraints Computing Range-Consistent Answers Experimental Results Conclusion and Future Work 4 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 16 / 34

  25. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregate Constraints Our approach for computing consistent answers exploits a restrictions imposed on aggregate constraints Definition (Steady aggregate constraint) ⇒ � n � � Aggregate constraint ∀ � x φ ( � x ) = i = 1 c i · χ i ( � y i ) ≤ K is steady if: for each χ i = � R i , e i , α i � , no measure attribute occurs in α i 1 measure variables occur at most once in the aggregate constraint 2 no constant occurring in φ is associated with a measure attribute 3 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 17 / 34

  26. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregate Constraints Our approach for computing consistent answers exploits a restrictions imposed on aggregate constraints Definition (Steady aggregate constraint) ⇒ � n � � Aggregate constraint ∀ � x φ ( � x ) = i = 1 c i · χ i ( � y i ) ≤ K is steady if: for each χ i = � R i , e i , α i � , no measure attribute occurs in α i 1 measure variables occur at most once in the aggregate constraint 2 no constant occurring in φ is associated with a measure attribute 3 - attribute Value is the measure attribute of BalanceSheets ( Year , Section , Subsection , Type , Value ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 17 / 34

  27. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregate Constraints Our approach for computing consistent answers exploits a restrictions imposed on aggregate constraints Definition (Steady aggregate constraint) ⇒ � n � � Aggregate constraint ∀ � x φ ( � x ) = i = 1 c i · χ i ( � y i ) ≤ K is steady if: for each χ i = � R i , e i , α i � , no measure attribute occurs in α i 1 measure variables occur at most once in the aggregate constraint 2 no constant occurring in φ is associated with a measure attribute 3 - measure variables are those variables occurring at the position of a measure attribute in φ - x 5 is the measure variable for φ = BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) , as it occur at the position of Value Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 17 / 34

  28. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregate Constraints Our approach for computing consistent answers exploits a restrictions imposed on aggregate constraints Definition (Steady aggregate constraint) ⇒ � n � � Aggregate constraint ∀ � x φ ( � x ) = i = 1 c i · χ i ( � y i ) ≤ K is steady if: for each χ i = � R i , e i , α i � , no measure attribute occurs in α i 1 measure variables occur at most once in the aggregate constraint 2 no constant occurring in φ is associated with a measure attribute 3 - a constant in φ is associated with a measure attribute if it occurs at the position of a measure attribute in φ - for φ = BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) , x 5 cannot be a constant Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 17 / 34

  29. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Complexity Results Steady aggregate constraints are expressive enough to ensure data consistency in several real-life scenarios The range-CQA problem is hard (even if aggregate constraints are steady) Theorem (Complexity of Range-CQA) Let D be a fixed database scheme, AC a fixed set of aggregate constraints on D , q a fixed aggregate query on D , D an instance of D , and [ ℓ, u ] a fixed interval. Deciding whether CQA q D , AC ( D ) � = ∅ is NP-complete 1 Deciding whether CQA q D , AC ( D ) ⊆ [ ℓ, u ] is ∆ p 2 [ log n ] -complete 2 The lower complexity bounds still hold in the case that AC is steady 3 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 18 / 34

  30. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Complexity Results Steady aggregate constraints are expressive enough to ensure data consistency in several real-life scenarios The range-CQA problem is hard (even if aggregate constraints are steady) Theorem (Complexity of Range-CQA) Let D be a fixed database scheme, AC a fixed set of aggregate constraints on D , q a fixed aggregate query on D , D an instance of D , and [ ℓ, u ] a fixed interval. Deciding whether CQA q D , AC ( D ) � = ∅ is NP-complete 1 Deciding whether CQA q D , AC ( D ) ⊆ [ ℓ, u ] is ∆ p 2 [ log n ] -complete 2 The lower complexity bounds still hold in the case that AC is steady 3 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 18 / 34

  31. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Basic Steps Our approach for computing range-consistent answers w.r.t. steady aggregate constraints consists of two steps: we compute the cardinality of card -minimal repairs by solving an 1 ILP instance starting from the knowledge of this cardinality, a pair of ILP 2 instances are solved for computing the greatest-lower bound and the least-upper bound of the answers Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 19 / 34

  32. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregation Expressions as Inequalities (1/2) A set of steady aggregate constraints AC on a database scheme D and an instance D of D can be translated into a set of linear inequalities S ( D , AC , D ) Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 → z 1 2008 Receipts cash sales det 100 → z 2 2008 Receipts receivables det 120 → z 3 2008 Receipts total cash receipts aggr 250 → z 4 2008 Disburs. payment of accounts det 120 → z 5 2008 Disburs. capital expenditure det 20 → z 6 2008 Disburs. long-term financing det 80 → z 7 2008 Disburs. total disbursements aggr 220 → z 8 → z 9 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 → z 10 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 20 / 34

  33. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregation Expressions as Inequalities (1/2) A set of steady aggregate constraints AC on a database scheme D and an instance D of D can be translated into a set of linear inequalities S ( D , AC , D ) Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 → z 1 2008 Receipts cash sales det 100 → z 2 2008 Receipts receivables det 120 → z 3 2008 Receipts total cash receipts aggr 250 → z 4 2008 Disburs. payment of accounts det 120 → z 5 2008 Disburs. capital expenditure det 20 → z 6 2008 Disburs. long-term financing det 80 → z 7 2008 Disburs. total disbursements aggr 220 → z 8 → z 9 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 → z 10 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 20 / 34

  34. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregation Expressions as Inequalities (1/2) A set of steady aggregate constraints AC on a database scheme D and an instance D of D can be translated into a set of linear inequalities S ( D , AC , D ) Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 → z 1 2008 Receipts cash sales det 100 → z 2 2008 Receipts receivables det 120 → z 3 2008 Receipts total cash receipts aggr 250 → z 4 � z 2 + z 3 = z 4 2008 Disburs. payment of accounts det 120 → z 5 z 5 + z 6 + z 7 = z 8 2008 Disburs. capital expenditure det 20 → z 6 2008 Disburs. long-term financing det 80 → z 7 2008 Disburs. total disbursements aggr 220 → z 8 → z 9 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 → z 10 κ 1 : BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , x 2 , det )= χ 1 ( x 1 , x 2 , aggr ) where χ 1 ( x , y , z )= � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 20 / 34

  35. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Steady Aggregation Expressions as Inequalities (2/2) Every solution of S ( D , AC , D ) corresponds to a (possibly not minimal, not M -bounded) repair for D w.r.t. AC Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 z 1 2008 Receipts cash sales det 100 z 2 S ( D , { κ 1 , κ 2 , κ 3 } , D ) : 2008 Receipts receivables det 120 z 3  z 4 − z 8 = z 9 2008 Receipts total cash receipts aggr 250 z 4   z 1 + z 9 = z 10 2008 Disburs. payment of accounts det 120 z 5  z 2 + z 3 = z 4 2008 Disburs. capital expenditure det 20 z 6   z 5 + z 6 + z 7 = z 8 2008 Disburs. long-term financing det 80 z 7  2008 Disburs. total disbursements aggr 220 z 8 2008 Balance net cash inflow drv 30 z 9 2008 Balance ending cash balance drv 80 z 10 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 21 / 34

  36. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Basic ILP Definition ( ILP ( D , AC , D ) ) Given a database scheme D , a set AC of steady aggregate constraints on D , and an instance D of D , ILP ( D , AC , D ) is:  A × � z ≤ B   z i − M ≤ 0 − z i − M ≤ 0  z i − v i − ( M + | v i | ) · δ i ≤ 0 − z i + v i − ( M + | v i | ) · δ i ≤ 0   z i ∈ Z δ i ∈ { 0 , 1 }  Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 22 / 34

  37. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Basic ILP Definition ( ILP ( D , AC , D ) ) Given a database scheme D , a set AC of steady aggregate constraints on D , and an instance D of D , ILP ( D , AC , D ) is:  A × � z ≤ B   z i − M ≤ 0 − z i − M ≤ 0  z i − v i − ( M + | v i | ) · δ i ≤ 0 − z i + v i − ( M + | v i | ) · δ i ≤ 0   z i ∈ Z δ i ∈ { 0 , 1 }  A × � z ≤ B is the set of inequalities S ( D , AC , D ) M bounds the absolute value of measure attributes v i is the database value corresponding to the variable z i · · · · · · · · · · · · · · · 2008 Receipts beginning cash drv 50 → z 1 v 1 = 50 · · · · · · · · · · · · · · · Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 22 / 34

  38. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Basic ILP Definition ( ILP ( D , AC , D ) ) Given a database scheme D , a set AC of steady aggregate constraints on D , and an instance D of D , ILP ( D , AC , D ) is:  A × � z ≤ B   z i − M ≤ 0 − z i − M ≤ 0  z i − v i − ( M + | v i | ) · δ i ≤ 0 − z i + v i − ( M + | v i | ) · δ i ≤ 0   z i ∈ Z δ i ∈ { 0 , 1 }  We defined mechanism for counting the number of updates: if z i � = v i , then δ i = 1 � δ i is an upper bound on the number of updates performed by the repair corresponding to the solution of ILP ( D , AC , D ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 22 / 34

  39. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Computing Repairs Theorem (Repairs ) There is a biunique correspondence between the solutions of ILP ( D , AC , D ) and the repairs for D w.r.t AC . In particular, every solution s of ILP ( D , AC , D ) corresponds to a repair ρ ( s ) such that the cardinality of ρ ( s ) is less than or equal to � δ i . The range-CQA is the empty interval if there is no repair Corollary (Empty Range-CQA) CQA q D , AC ( D ) = ∅ iff ILP ( D , AC , D ) has no solution. Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 23 / 34

  40. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Computing Repairs Theorem (Repairs ) There is a biunique correspondence between the solutions of ILP ( D , AC , D ) and the repairs for D w.r.t AC . In particular, every solution s of ILP ( D , AC , D ) corresponds to a repair ρ ( s ) such that the cardinality of ρ ( s ) is less than or equal to � δ i . The range-CQA is the empty interval if there is no repair Corollary (Empty Range-CQA) CQA q D , AC ( D ) = ∅ iff ILP ( D , AC , D ) has no solution. Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 23 / 34

  41. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Computing the Minimum Cardinality of Repairs OPT ( D , AC , D ) := minimize � i δ i subject to ILP ( D , AC , D ) Corollary (Cardinality of Card -minimal repairs) The optimal value of OPT ( D , AC , D ) coincides with the cardinality of any card-minimal repair for D w.r.t. AC . The solution of OPT ( D , AC , D ) is exploited to compute (not empty) range-consistent answers Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 24 / 34

  42. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work SUM -queries (1/2) Let λ be the cardinality of any card -minimal repair. The solutions of � ILP ( D , AC , D ) λ = � δ i one-to-one correspond to card-minimal repairs for D w.r.t. AC For q = SELECT SUM ( A ) FROM R WHERE α we define T ( q ) as � = α z t , A , t : t ∈ R ∧ t | i.e., the sum of variables z associated with tuples of R satisfying the WHERE condition � ILP ( D , AC , D ) λ = � δ i minimizing (resp. maximizing) T ( q ) subject to result in the minimum (resp. maximum) value of q on all the databases resulting from applying card -minimal repairs Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 25 / 34

  43. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work SUM -queries (1/2) Let λ be the cardinality of any card -minimal repair. The solutions of � ILP ( D , AC , D ) λ = � δ i one-to-one correspond to card-minimal repairs for D w.r.t. AC For q = SELECT SUM ( A ) FROM R WHERE α we define T ( q ) as � = α z t , A , t : t ∈ R ∧ t | i.e., the sum of variables z associated with tuples of R satisfying the WHERE condition � ILP ( D , AC , D ) λ = � δ i minimizing (resp. maximizing) T ( q ) subject to result in the minimum (resp. maximum) value of q on all the databases resulting from applying card -minimal repairs Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 25 / 34

  44. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work SUM -queries (1/2) Let λ be the cardinality of any card -minimal repair. The solutions of � ILP ( D , AC , D ) λ = � δ i one-to-one correspond to card-minimal repairs for D w.r.t. AC For q = SELECT SUM ( A ) FROM R WHERE α we define T ( q ) as � = α z t , A , t : t ∈ R ∧ t | i.e., the sum of variables z associated with tuples of R satisfying the WHERE condition � ILP ( D , AC , D ) λ = � δ i minimizing (resp. maximizing) T ( q ) subject to result in the minimum (resp. maximum) value of q on all the databases resulting from applying card -minimal repairs Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 25 / 34

  45. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work SUM -queries (2/2) greatest-lower bound least-upper bound OPT SUM OPT SUM glb ( D , AC , q , D ) := lub ( D , AC , q , D ) := minimize T ( q ) subject to maximize T ( q ) subject to � � ILP ( D , AC , D ) ILP ( D , AC , D ) λ = � δ i λ = � δ i Theorem (Range-Consistent Answer of SUM -query ) For a SUM -query q, either CQA q D , AC ( D ) = ∅ , or CQA q D , AC ( D ) = [ ℓ, u ] , where ℓ is the value returned by OPT SUM glb ( D , AC , q , D ) 1 u the value returned by OPT SUM lub ( D , AC , q , D ) . 2 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 26 / 34

  46. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work MAX -queries (1/2) Additional inequalities are exploited to encode the MAX function Let I ( q ) be the set of indexes of variables z associated with the tuples selected by MAX -query q , we define In ( q ) as  z j − z i − 2 M · µ i ≤ 0 ∀ j , i ∈ I ( q ) , j � = i   � i ∈I ( q ) µ i = |I ( q ) | − 1     x i − M · µ i ≤ 0 ; − x i − M · µ i ≤ 0 ;  z i − x i − 2 M · ( 1 − µ i ) ≤ 0 ; − z i + x i − 2 M · ( 1 − µ i ) ≤ 0 ;   x i − M ≤ 0 ; − x i − M ≤ 0 ;     x i ∈ Z ; µ i ∈ { 0 , 1 } ; ∀ i ∈ I ( q );  � z i if z i takes the maximum value among variables z j z i − x i = 0 otherwise Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 27 / 34

  47. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work MAX -queries (1/2) Additional inequalities are exploited to encode the MAX function Let I ( q ) be the set of indexes of variables z associated with the tuples selected by MAX -query q , we define In ( q ) as  z j − z i − 2 M · µ i ≤ 0 ∀ j , i ∈ I ( q ) , j � = i   � i ∈I ( q ) µ i = |I ( q ) | − 1     x i − M · µ i ≤ 0 ; − x i − M · µ i ≤ 0 ;  z i − x i − 2 M · ( 1 − µ i ) ≤ 0 ; − z i + x i − 2 M · ( 1 − µ i ) ≤ 0 ;   x i − M ≤ 0 ; − x i − M ≤ 0 ;     x i ∈ Z ; µ i ∈ { 0 , 1 } ; ∀ i ∈ I ( q );  � z i if z i takes the maximum value among variables z j z i − x i = 0 otherwise Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 27 / 34

  48. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work MAX -queries (1/2) Additional inequalities are exploited to encode the MAX function Let I ( q ) be the set of indexes of variables z associated with the tuples selected by MAX -query q , we define In ( q ) as  z j − z i − 2 M · µ i ≤ 0 ∀ j , i ∈ I ( q ) , j � = i   � i ∈I ( q ) µ i = |I ( q ) | − 1     x i − M · µ i ≤ 0 ; − x i − M · µ i ≤ 0 ;  z i − x i − 2 M · ( 1 − µ i ) ≤ 0 ; − z i + x i − 2 M · ( 1 − µ i ) ≤ 0 ;   x i − M ≤ 0 ; − x i − M ≤ 0 ;     x i ∈ Z ; µ i ∈ { 0 , 1 } ; ∀ i ∈ I ( q );  � z i if z i takes the maximum value among variables z j z i − x i = 0 otherwise Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 27 / 34

  49. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work MAX -queries (2/2) OPT MAX OPT MAX glb ( D , AC , q , D ) := lub ( D , AC , q , D ) := minimize � maximize � i ∈I ( q ) ( z i − x i ) i ∈I ( q ) ( z i − x i ) subject to subject to   ILP ( D , AC , D ) ILP ( D , AC , D ) λ = � δ i λ = � δ i   In ( q ) In ( q )   Theorem (Range-Consistent Answer of MAX -query ) For a MAX -query q, either CQA q D , AC ( D ) = ∅ , or CQA q D , AC ( D ) = [ ℓ, u ] ℓ is the value returned by OPT MAX glb ( D , AC , q , D ) 1 u the value returned by OPT MAX lub ( D , AC , q , D ) . 2 A similar (symmetric) result holds for MIN -queries Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 28 / 34

  50. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work MAX -queries (2/2) OPT MAX OPT MAX glb ( D , AC , q , D ) := lub ( D , AC , q , D ) := minimize � maximize � i ∈I ( q ) ( z i − x i ) i ∈I ( q ) ( z i − x i ) subject to subject to   ILP ( D , AC , D ) ILP ( D , AC , D ) λ = � δ i λ = � δ i   In ( q ) In ( q )   Theorem (Range-Consistent Answer of MAX -query ) For a MAX -query q, either CQA q D , AC ( D ) = ∅ , or CQA q D , AC ( D ) = [ ℓ, u ] ℓ is the value returned by OPT MAX glb ( D , AC , q , D ) 1 u the value returned by OPT MAX lub ( D , AC , q , D ) . 2 A similar (symmetric) result holds for MIN -queries Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 28 / 34

  51. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work MAX -queries (2/2) OPT MAX OPT MAX glb ( D , AC , q , D ) := lub ( D , AC , q , D ) := minimize � maximize � i ∈I ( q ) ( z i − x i ) i ∈I ( q ) ( z i − x i ) subject to subject to   ILP ( D , AC , D ) ILP ( D , AC , D ) λ = � δ i λ = � δ i   In ( q ) In ( q )   Theorem (Range-Consistent Answer of MAX -query ) For a MAX -query q, either CQA q D , AC ( D ) = ∅ , or CQA q D , AC ( D ) = [ ℓ, u ] ℓ is the value returned by OPT MAX glb ( D , AC , q , D ) 1 u the value returned by OPT MAX lub ( D , AC , q , D ) . 2 A similar (symmetric) result holds for MIN -queries Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 28 / 34

  52. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Experiment 1 on data set Balance Sheets Average time needed for computing range-consistent answers vs. the percentage of erroneous values 30 C1, SUM C1, MAX 25 C1, MIN 3 years balance C2, SUM sheets of companies 20 C2, MAX C 1 , C 2 , C 3 C2, MIN sec. containing 346, 780, 15 C3, SUM and 1234 tuples, C3, MAX respectively 10 C3, MIN typically the percentage of errors 5 is less than 5 % of acquired data 0 1 2 3 4 5 6 7 8 9 10 # errors / # items (%) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 29 / 34

  53. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Experiment 1 on data set Balance Sheets Average time needed for computing range-consistent answers vs. the percentage of erroneous values 30 C1, SUM C1, MAX 25 C1, MIN 3 years balance C2, SUM sheets of companies 20 C2, MAX C 1 , C 2 , C 3 C2, MIN sec. containing 346, 780, 15 C3, SUM and 1234 tuples, C3, MAX respectively 10 C3, MIN typically the percentage of errors 5 is less than 5 % of acquired data 0 1 2 3 4 5 6 7 8 9 10 # errors / # items (%) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 29 / 34

  54. Introduction Steady Aggregate Constraints Preliminaries Computing Range-Consistent Answers Query Answering Experimental Results Conclusion and Future Work Experiment 2 on data set Balance Sheets An insight on the impact of the database size on the performance of our technique (5 % of erroneous values) 8 C1, SUM 7 C1, MAX C1, MIN 6 C2, SUM C2, MAX 5 every 1-year balance C2, MIN sec. sheet of companies 4 C3, SUM C 1 , C 2 , C 3 contains C3, MAX 3 about 115, 260, and C3, MIN 410 tuples, 2 respectively 1 0 1 2 3 4 5 Number of years Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 30 / 34

  55. Introduction Preliminaries Query Answering Conclusion and Future Work Outline Introduction 1 Motivation Contribution Preliminaries 2 Aggregate Constraints Repairs Aggregate Queries Query Answering 3 Steady Aggregate Constraints Computing Range-Consistent Answers Experimental Results Conclusion and Future Work 4 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 31 / 34

  56. Introduction Preliminaries Query Answering Conclusion and Future Work Conclusion and ... We have introduced a framework for computing range-consistent answers of MAX -, MIN -, and SUM -queries in numerical databases violating a given set of aggregate constraints Our approach exploits a transformation into integer linear programming (ILP), thus allowing us to exploit well-known techniques for solving ILP problems Experimental results prove the feasibility of the proposed approach in real-life application scenarios Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 32 / 34

  57. Introduction Preliminaries Query Answering Conclusion and Future Work ... Future Work Further work will be devoted to devising strategies for computing range-consistent answers of other form of queries (e.g. AVG , GROUPBY clause,...) devising strategies for improving performance of our technique (e.g., reducing the number of variables and inequalities used) devising a transformation for non-steady constraints (and queries with WHERE clause containing also measure attributes) remove the assumption that measure attributes are bounded in value (range-consistent answers can be ±∞ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 33 / 34

  58. Introduction Preliminaries Query Answering Conclusion and Future Work Thank you! ... any question? Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 34 / 34

  59. Backup Slides Appendix For Further Reading I Related Work The range-consistent query answer semantics was introduced in [Arenas et Al (TCS 2003)], as a more specific notion of consistent answer w.r.t. the original definition of [Arenas et Al (PODS 1999)] for dealing with aggregate queries (in the presence of FDs) Range-CQAs were further investigated in [Fuxman et Al (SIGMOD 2005)] for aggregate queries with grouping under key constraints [Flesca et Al (TODS 2010)] investigated several problems regarding the extraction of reliable information from data violating aggregate constraints (including CQA for atomic ground queries) None of these works investigated range-CQAa to aggregate queries under of aggregate constraints. Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 35 / 34

  60. Backup Slides Appendix For Further Reading I Related Work The range-consistent query answer semantics was introduced in [Arenas et Al (TCS 2003)], as a more specific notion of consistent answer w.r.t. the original definition of [Arenas et Al (PODS 1999)] for dealing with aggregate queries (in the presence of FDs) Range-CQAs were further investigated in [Fuxman et Al (SIGMOD 2005)] for aggregate queries with grouping under key constraints [Flesca et Al (TODS 2010)] investigated several problems regarding the extraction of reliable information from data violating aggregate constraints (including CQA for atomic ground queries) None of these works investigated range-CQAa to aggregate queries under of aggregate constraints. Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 35 / 34

  61. Backup Slides Appendix For Further Reading I Related Work The range-consistent query answer semantics was introduced in [Arenas et Al (TCS 2003)], as a more specific notion of consistent answer w.r.t. the original definition of [Arenas et Al (PODS 1999)] for dealing with aggregate queries (in the presence of FDs) Range-CQAs were further investigated in [Fuxman et Al (SIGMOD 2005)] for aggregate queries with grouping under key constraints [Flesca et Al (TODS 2010)] investigated several problems regarding the extraction of reliable information from data violating aggregate constraints (including CQA for atomic ground queries) None of these works investigated range-CQAa to aggregate queries under of aggregate constraints. Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 35 / 34

  62. Backup Slides Appendix For Further Reading I Related Work The range-consistent query answer semantics was introduced in [Arenas et Al (TCS 2003)], as a more specific notion of consistent answer w.r.t. the original definition of [Arenas et Al (PODS 1999)] for dealing with aggregate queries (in the presence of FDs) Range-CQAs were further investigated in [Fuxman et Al (SIGMOD 2005)] for aggregate queries with grouping under key constraints [Flesca et Al (TODS 2010)] investigated several problems regarding the extraction of reliable information from data violating aggregate constraints (including CQA for atomic ground queries) None of these works investigated range-CQAa to aggregate queries under of aggregate constraints. Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 35 / 34

  63. Backup Slides Appendix For Further Reading I Related Work Arenas, M., Bertossi, L.E., Chomicki, J.: Consistent query answers in inconsistent databases. In: Proc. 18 th ACM Symp. on Principles of Database Systems (PODS). (1999) 68–79 Arenas, M., Bertossi, L.E., Chomicki, J., He, X., Raghavan, V., Spinrad, J.: Scalar aggregation in inconsistent databases. Theor. Comput. Sci. (TCS) Vol. 3(296) (2003) 405–434 Fuxman, A., Fazli, E., Miller, R.J.: Conquer: Efficient management of inconsistent databases. In: Proc. ACM SIGMOD Int. Conf. on Management of Data (SIGMOD). (2005) 155–166 Flesca, S., Furfaro, F ., Parisi, F .: Querying and Repairing Inconsistent Numerical Databases. ACM Transactions on Database Systems (TODS), Vol 35 (2), 2010 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 36 / 34

  64. Backup Slides Appendix For Further Reading I Semantics of Aggregate Constraints An aggregate constraint is an aggregation expression that a database should satisfy The database D satisfies the aggregate constraint ⇒ � n � � ∀ � φ ( � i = 1 c i · χ i ( � y i ) ≤ K κ : x x ) = if, for all the substitutions of the variables in � x with constants making the conjunction of atoms on the LHS ( κ ) true , the inequality on the RHS ( κ ) holds on D . A database D is consistent w.r.t. a set of aggregate constraints AC if D | = AC Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 37 / 34

  65. Backup Slides Appendix For Further Reading I Semantics of Aggregate Constraints An aggregate constraint is an aggregation expression that a database should satisfy The database D satisfies the aggregate constraint ⇒ � n � � ∀ � φ ( � i = 1 c i · χ i ( � y i ) ≤ K κ : x x ) = if, for all the substitutions of the variables in � x with constants making the conjunction of atoms on the LHS ( κ ) true , the inequality on the RHS ( κ ) holds on D . A database D is consistent w.r.t. a set of aggregate constraints AC if D | = AC Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 37 / 34

  66. Backup Slides Appendix For Further Reading I Semantics of Aggregate Constraints An aggregate constraint is an aggregation expression that a database should satisfy The database D satisfies the aggregate constraint ⇒ � n � � ∀ � φ ( � i = 1 c i · χ i ( � y i ) ≤ K κ : x x ) = if, for all the substitutions of the variables in � x with constants making the conjunction of atoms on the LHS ( κ ) true , the inequality on the RHS ( κ ) holds on D . A database D is consistent w.r.t. a set of aggregate constraints AC if D | = AC Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 37 / 34

  67. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (1/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘net cash inflow’ ) − ( χ 1 ( x 1 , ‘total cash receipts’ ) − χ 1 ( x 1 , ‘total disbursements’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 38 / 34

  68. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (1/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘net cash inflow’ ) − ( χ 1 ( x 1 , ‘total cash receipts’ ) − χ 1 ( x 1 , ‘total disbursements’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 38 / 34

  69. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (1/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘net cash inflow’ ) − ( χ 1 ( x 1 , ‘total cash receipts’ ) − χ 1 ( x 1 , ‘total disbursements’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 38 / 34

  70. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (1/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘net cash inflow’ ) − ( χ 1 ( x 1 , ‘total cash receipts’ ) − χ 1 ( x 1 , ‘total disbursements’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 38 / 34

  71. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (2/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 2 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow . χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘ending cash balance’ ) − ( χ 1 ( x 1 , ‘ beginning cash’ ) + χ 1 ( x 1 , ’net cash inflow’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 39 / 34

  72. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (2/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 2 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow . χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘ending cash balance’ ) − ( χ 1 ( x 1 , ‘ beginning cash’ ) + χ 1 ( x 1 , ’net cash inflow’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 39 / 34

  73. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (2/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 2 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow . χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘ending cash balance’ ) − ( χ 1 ( x 1 , ‘ beginning cash’ ) + χ 1 ( x 1 , ’net cash inflow’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 39 / 34

  74. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (2/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 2 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow . χ 1 ( x , y ) = � BalanceSheets, Value, ( Year = x ∧ Subsection = y ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 1 ( x 1 , ‘ending cash balance’ ) − ( χ 1 ( x 1 , ‘ beginning cash’ ) + χ 1 ( x 1 , ’net cash inflow’ )) = 0 Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 39 / 34

  75. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (3/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 3 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 2 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 2 ( x 1 , x 2 , ‘det’ ) = χ 2 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 40 / 34

  76. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (3/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 3 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 2 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 2 ( x 1 , x 2 , ‘det’ ) = χ 2 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 40 / 34

  77. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (3/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 3 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 2 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 2 ( x 1 , x 2 , ‘det’ ) = χ 2 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 40 / 34

  78. Backup Slides Appendix For Further Reading I Example of Aggregate Constraint (3/3) BalanceSheets Year Section Subsection Type Value 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 2008 Receipts receivables det 120 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 3 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year χ 2 ( x , y , z ) = � BalanceSheets, Value, ( Year = x ∧ Section = y ∧ Type = z ) � BalanceSheets ( x 1 , x 2 , x 3 , x 4 , x 5 ) = ⇒ χ 2 ( x 1 , x 2 , ‘det’ ) = χ 2 ( x 1 , x 2 , ‘aggr’ ) Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 40 / 34

  79. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  80. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  81. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  82. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  83. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  84. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  85. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  86. Backup Slides Appendix For Further Reading I Two examples of card -minimal repairs Year Section Subsection Type Value ρ 1 ρ 2 2008 Receipts beginning cash drv 50 2008 Receipts cash sales det 100 − → 130 2008 Receipts receivables det 120 − → 150 2008 Receipts total cash receipts aggr 250 2008 Disbursements payment of accounts det 120 2008 Disbursements capital expenditure det 20 2008 Disbursements long-term financing det 80 2008 Disbursements total disbursements aggr 220 2008 Balance net cash inflow drv 30 2008 Balance ending cash balance drv 80 κ 1 for each section and year, the sum of the values of all detail items must be equal to the value of the aggregate item of the same section and year κ 2 for each year, the net cash inflow must be equal to the difference between total cash receipts and total disbursements κ 3 for each year, the ending cash balance must be equal to the sum of the beginning cash and the net cash inflow Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 41 / 34

  87. Backup Slides Appendix For Further Reading I Repairing non-numerical data (1/2) We assume that inconsistencies involve numerical attributes (measure attributes) only Non-measure attributes are assumed to be consistent In many real-life situations, even if integrity violations of measure data can coexist with integrity violations involving non-measure data, these inconsistencies can be fixed separately Sergio Flesca, Filippo Furfaro, Francesco Parisi Range-CQA under Aggregate Constraints 42 / 34

Recommend


More recommend