Expected Ranks Example

tuples   score
t_1      {(120, 0.8), (62, 0.2)}
t_2      {(103, 0.7), (70, 0.3)}
t_3      {(98, 1)}

world W                             Pr[W]
{t_1 = 120, t_2 = 103, t_3 = 98}    0.8 × 0.7 × 1 = 0.56
{t_1 = 120, t_3 = 98, t_2 = 70}     0.8 × 0.3 × 1 = 0.24
{t_2 = 103, t_3 = 98, t_1 = 62}     0.2 × 0.7 × 1 = 0.14
{t_3 = 98, t_2 = 70, t_1 = 62}      0.2 × 0.3 × 1 = 0.06

tuple    r(tuple)
t_1      0.56 × 0 + 0.24 × 0 + 0.14 × 2 + 0.06 × 2 = 0.4
t_2      0.56 × 1 + 0.24 × 2 + 0.14 × 0 + 0.06 × 1 = 1.1
t_3      0.56 × 2 + 0.24 × 1 + 0.14 × 1 + 0.06 × 0 = 1.5
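The possible-worlds calculation in this example can be reproduced with a short brute-force sketch. It assumes that each tuple picks its score independently and that a tuple's rank in a world is the number of tuples with a strictly higher score; the function name `expected_ranks_by_worlds` is made up for illustration.

```python
from itertools import product

# Each tuple's score pdf as (value, probability) choices, taken from the example.
tuples = {
    "t1": [(120, 0.8), (62, 0.2)],
    "t2": [(103, 0.7), (70, 0.3)],
    "t3": [(98, 1.0)],
}

def expected_ranks_by_worlds(tuples):
    """Enumerate every possible world and average each tuple's rank over them."""
    names = list(tuples)
    ranks = {name: 0.0 for name in names}
    for world in product(*(tuples[name] for name in names)):
        # Probability of this world is the product of the chosen probabilities.
        prob = 1.0
        for _, p in world:
            prob *= p
        scores = {name: v for name, (v, _) in zip(names, world)}
        for name in names:
            # Rank = number of tuples with a strictly higher score in this world.
            rank = sum(1 for other in names if scores[other] > scores[name])
            ranks[name] += prob * rank
    return ranks

ranks = expected_ranks_by_worlds(tuples)
print({name: round(r, 2) for name, r in ranks.items()})
```

The number of worlds grows exponentially with the number of tuples, so this is only practical for checking small examples by hand.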
Expected Ranks

It has been shown that $r(t_i)$ may be written as

$$ r(t_i) = \sum_{l=1}^{b_i} p_{i,l} \left( q(v_{i,l}) - \Pr[X_i > v_{i,l}] \right) \qquad (2) $$

where
$b_i$ = number of choices in the pdf of $t_i$
$p_{i,l}$ = probability of choice $l$ in tuple $t_i$
$q(v_{i,l}) = \sum_j \Pr[X_j > v_{i,l}]$
$X_i$ = pdf of tuple $t_i$
$\Pr[X_i > v_{i,l}]$ = contribution of $t_i$ to $q(v_{i,l})$

$q(v_{i,l})$ is the sum of the probabilities that a tuple will outrank a tuple with score $v_{i,l}$.

$X_i$ may contain value-probability pairs $(v, p)$ with $v > v_{i,l}$. Since the existence of $t_i = v_{i,l}$ precludes $t_i = v$, we must subtract the corresponding $p$'s from $q(v_{i,l})$.

Efficient algorithms exist to compute the expected ranks in $O(N \log N)$ time for a database of $N$ tuples.
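Equation (2) can be checked against the running example with a direct transcription. This is a minimal sketch, not the efficient algorithm the slide refers to: it recomputes q(v) from scratch for every choice, so its cost is quadratic. The helper names `prob_greater` and `expected_rank` are illustrative.

```python
def prob_greater(pdf, v):
    """Pr[X > v] for a discrete pdf given as (value, probability) pairs."""
    return sum(p for value, p in pdf if value > v)

def expected_rank(i, pdfs):
    """Equation (2): r(t_i) = sum_l p_il * (q(v_il) - Pr[X_i > v_il])."""
    total = 0.0
    for v, p in pdfs[i]:
        q = sum(prob_greater(pdf, v) for pdf in pdfs)  # q(v) over all tuples
        total += p * (q - prob_greater(pdfs[i], v))    # remove t_i's own share
    return total

# The three tuples of the running example, in the order t_1, t_2, t_3.
pdfs = [
    [(120, 0.8), (62, 0.2)],
    [(103, 0.7), (70, 0.3)],
    [(98, 1.0)],
]
print([round(expected_rank(i, pdfs), 2) for i in range(len(pdfs))])
```

The results match the possible-worlds values 0.4, 1.1 and 1.5 without enumerating any world.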
Computing Expected Ranks by q(v)'s

tuples   score
t_1      {(120, 0.8), (62, 0.2)}
t_2      {(103, 0.7), (70, 0.3)}
t_3      {(98, 1)}

[Figure: staircase plot of q(v) against v]

v       −∞     62     70     98     103    120
q(v)    3.0    2.8    2.5    1.5    0.8    0

$$ r(t_1) = 0.8 \times 0 + 0.2 \times (2.8 - 0.8) = 0.4 $$
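The staircase values of q(v) can be precomputed once and then looked up per choice. The sketch below is one way to do that, assuming a sort followed by suffix sums and a binary search; it is not necessarily the algorithm the slides have in mind, and `build_q` and `expected_rank_with_q` are hypothetical names. Under the extra assumption that each pdf has a bounded number of choices, the total work is dominated by the sort.

```python
from bisect import bisect_right

def build_q(pdfs):
    """Return a function q(v) = sum_j Pr[X_j > v], backed by sorted suffix sums."""
    pairs = sorted((v, p) for pdf in pdfs for v, p in pdf)
    values = [v for v, _ in pairs]
    # suffix[k] = total probability mass of pairs[k:], so suffix[len(pairs)] = 0.
    suffix = [0.0] * (len(pairs) + 1)
    for k in range(len(pairs) - 1, -1, -1):
        suffix[k] = suffix[k + 1] + pairs[k][1]

    def q(v):
        # Mass strictly above v starts at the first index whose value exceeds v.
        return suffix[bisect_right(values, v)]

    return q

def expected_rank_with_q(pdf, q):
    """Equation (2), using a precomputed q and subtracting the tuple's own share."""
    total = 0.0
    for v, p in pdf:
        own = sum(p2 for v2, p2 in pdf if v2 > v)  # Pr[X_i > v]
        total += p * (q(v) - own)
    return total

pdfs = [
    [(120, 0.8), (62, 0.2)],
    [(103, 0.7), (70, 0.3)],
    [(98, 1.0)],
]
q = build_q(pdfs)
print([round(q(v), 1) for v in (float("-inf"), 62, 70, 98, 103, 120)])
print(round(expected_rank_with_q(pdfs[0], q), 2))
```

The first printed list corresponds to the staircase heights of the figure, and the second value to r(t_1).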
Distributed Probabilistic Data Model

site 1                      ...    site m
tuples     score                   tuples     score
t_{1,1}    X_{1,1}                 t_{m,1}    X_{m,1}
t_{1,2}    X_{1,2}                 t_{m,2}    X_{m,2}
...        ...                     ...        ...

Conceptual database D: tuples t_1, t_2, ..., t_N

We can think of the union of the individual databases $D_i$ at each site $s_i$ as a conceptual database $D$.
Ranking Queries for Distributed Probabilistic Data

We introduce two frameworks for ranking queries on distributed probabilistic data:
- Sorted Access on Local Ranks
- Sorted Access on Expected Scores
Sorted Access on Local Ranks Framework

[Figure: a server connected to sites 1, ..., m; site i holds the tuples t_{i,1}, t_{i,2}, ..., t_{i,n_i}]

Every site calculates the local ranks of its tuples and stores the tuples in ascending order of local rank.

The server accesses tuples in ascending order of local rank and combines the local ranks to get the global ranks.
Local and Global Ranks

The local rank of a tuple $t_{i,j}$ at its own site $s_i$ with database $D_i$ is

$$ r(t_{i,j}, D_i) = \sum_{l=1}^{b_{i,j}} p_{i,j,l} \left( q_i(v_{i,j,l}) - \Pr[X_{i,j} > v_{i,j,l}] \right) \qquad (3) $$

The local rank of a tuple $t_{i,j}$ at a site $s_y$ with database $D_y$, where $i \neq y$, is

$$ r(t_{i,j}, D_y) = \sum_{l=1}^{b_{i,j}} p_{i,j,l} \, q_y(v_{i,j,l}) \qquad (4) $$

The global rank of a tuple $t_{i,j}$ is

$$ r(t_{i,j}) = \sum_{y=1}^{m} r(t_{i,j}, D_y) \qquad (5) $$
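A small sketch can confirm that summing local ranks over the sites gives back the expected rank in the conceptual database. The two-site split of the earlier three-tuple example is an assumption made purely for illustration, as are the function names; sites are numbered from 0 in the code.

```python
def prob_greater(pdf, v):
    """Pr[X > v] for a discrete pdf given as (value, probability) pairs."""
    return sum(p for value, p in pdf if value > v)

def local_rank(pdf, site_pdfs, own_site=False):
    """Equations (3) and (4): rank of one tuple against one site's database."""
    total = 0.0
    for v, p in pdf:
        q = sum(prob_greater(other, v) for other in site_pdfs)
        if own_site:
            q -= prob_greater(pdf, v)  # equation (3) removes the tuple's own share
        total += p * q
    return total

def global_rank(i, j, sites):
    """Equation (5): sum the local ranks of t_{i,j} over every site."""
    pdf = sites[i][j]
    return sum(local_rank(pdf, db, own_site=(y == i)) for y, db in enumerate(sites))

# Hypothetical split of the running example across two sites.
sites = [
    [[(120, 0.8), (62, 0.2)], [(98, 1.0)]],  # site 0 holds t_1 and t_3
    [[(103, 0.7), (70, 0.3)]],               # site 1 holds t_2
]
print(round(global_rank(0, 0, sites), 2),
      round(global_rank(1, 0, sites), 2),
      round(global_rank(0, 1, sites), 2))
```

The three values agree with the expected ranks 0.4, 1.1 and 1.5 computed on the undivided database.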
Sorted Access on Local Ranks Initialization

Each site sends its first tuple (the one with the smallest local rank) to the server, which places it in the Representative (Rep.) Queue.

Rep. Queue
tuple      lrank
t_{3,1}    0.8
t_{1,1}    1.2
t_{2,1}    2.3

site 1                 site 2                 site 3
tuple        lrank     tuple        lrank     tuple        lrank
t_{1,1}      1.2       t_{2,1}      2.3       t_{3,1}      0.8
t_{1,2}      5.9       t_{2,2}      3.4       t_{3,2}      4.1
...          ...       ...          ...       ...          ...
t_{1,n_1}    34.2      t_{2,n_2}    29.1      t_{3,n_3}    40.4
Sorted Access on Local Ranks: a Round

State at the start of the round (k = 2):

Rep. Queue             top-2 Queue
tuple      lrank       tuple      grank
t_{2,2}    3.4         t_{2,1}    5.4
t_{3,2}    4.1         t_{1,1}    7.9
t_{1,2}    5.9

1. The server pops the head of the Rep. Queue, $t_{2,2}$ with lrank 3.4.
2. Site 2 sends its next tuple, $t_{2,3}$ with lrank 4.8, which enters the Rep. Queue.
3. The server sends $X_{2,2}$ to the other sites.
4. The other sites return their local ranks for $t_{2,2}$: 0.7 and 1.5.
5. The server computes the global rank of $t_{2,2}$: 3.4 + 0.7 + 1.5 = 5.6.
6. $t_{2,2}$ enters the top-2 Queue and displaces $t_{1,1}$ (grank 7.9).

State at the end of the round:

Rep. Queue             top-2 Queue
tuple      lrank       tuple      grank
t_{3,2}    4.1         t_{2,1}    5.4
t_{2,3}    4.8         t_{2,2}    5.6
t_{1,2}    5.9

We can safely terminate whenever the largest grank in the top-k queue is ≤ the smallest lrank in the Rep. Queue. This is algorithm A-LR.
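The round structure and the termination test can be simulated in a single process. The sketch below is a rough model under several assumptions: sites are plain lists rather than remote machines, the "broadcast" is a function call, sites are numbered from 0, and the names `a_lr` and `local_rank` are illustrative. The stopping rule relies on the fact that a global rank is never smaller than the tuple's local rank at its own site.

```python
import heapq

def prob_greater(pdf, v):
    return sum(p for value, p in pdf if value > v)

def local_rank(pdf, site_pdfs, own_site=False):
    """Local rank of one tuple against one site's database (equations (3)/(4))."""
    total = 0.0
    for v, p in pdf:
        q = sum(prob_greater(other, v) for other in site_pdfs)
        if own_site:
            q -= prob_greater(pdf, v)
        total += p * q
    return total

def a_lr(sites, k):
    """Simulate sorted access on local ranks; return the k smallest (grank, site, tuple)."""
    # Each site sorts its own tuples by ascending local rank.
    sorted_sites = [
        sorted((local_rank(pdf, db, own_site=True), j) for j, pdf in enumerate(db))
        for db in sites
    ]
    # Representative queue: one (lrank, site, position) entry per non-empty site.
    rep = [(ranked[0][0], i, 0) for i, ranked in enumerate(sorted_sites) if ranked]
    heapq.heapify(rep)
    top = []  # the k best (grank, site, tuple index) seen so far, kept sorted

    while rep:
        if len(top) == k and top[-1][0] <= rep[0][0]:
            break  # no unseen tuple can beat the current top-k
        lrank, i, pos = heapq.heappop(rep)
        j = sorted_sites[i][pos][1]
        if pos + 1 < len(sorted_sites[i]):
            # The site replaces its representative with its next tuple.
            heapq.heappush(rep, (sorted_sites[i][pos + 1][0], i, pos + 1))
        # "Broadcast" the pdf: every other site reports its local rank for it.
        grank = lrank + sum(
            local_rank(sites[i][j], db) for y, db in enumerate(sites) if y != i
        )
        top = sorted(top + [(grank, i, j)])[:k]
    return top

sites = [
    [[(120, 0.8), (62, 0.2)], [(98, 1.0)]],  # hypothetical site 0
    [[(103, 0.7), (70, 0.3)]],               # hypothetical site 1
]
print([(round(g, 2), i, j) for g, i, j in a_lr(sites, 2)])
```

A real deployment would also count messages per round, since communication cost is what the sorted-access frameworks aim to reduce.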
Sorted Access on Expected Scores Framework

[Figure: a server connected to sites 1, ..., m; site i holds the tuples t_{i,1}, t_{i,2}, ..., t_{i,n_i}]

Every site calculates the local ranks and the expected scores of its tuples and stores the tuples in descending order of expected score.

Tuples are accessed in descending order of expected score, and the server calculates the global ranks.
Sorted Access on Expected Scores Initialization

Each site sends its first tuple (the one with the largest expected score) to the server, which places it in the Rep. Queue.

Rep. Queue
tuple      E[X]
t_{3,1}    500
t_{1,1}    489
t_{2,1}    476

site 1                 site 2                 site 3
tuple        E[X]      tuple        E[X]      tuple        E[X]
t_{1,1}      489       t_{2,1}      476       t_{3,1}      500
t_{1,2}      421       t_{2,2}      464       t_{3,2}      432
...          ...       ...          ...       ...          ...
t_{1,n_1}    5         t_{2,n_2}    11        t_{3,n_3}    1
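The initialization step can be sketched as follows, assuming discrete pdfs of (value, probability) pairs and Python's min-heap `heapq` with negated keys to get largest-first access. The function names and the two-site data are hypothetical.

```python
import heapq

def expected_score(pdf):
    """E[X] for a discrete pdf of (value, probability) pairs."""
    return sum(v * p for v, p in pdf)

def init_rep_queue(sites):
    """Sort each site by descending E[X] and queue each site's first tuple."""
    sorted_sites = [
        sorted(((expected_score(pdf), j) for j, pdf in enumerate(db)), reverse=True)
        for db in sites
    ]
    # heapq is a min-heap, so store -E[X] to pop the largest expected score first.
    rep = [(-ranked[0][0], i, 0) for i, ranked in enumerate(sorted_sites) if ranked]
    heapq.heapify(rep)
    return sorted_sites, rep

sites = [
    [[(120, 0.8), (62, 0.2)], [(98, 1.0)]],  # hypothetical site 0
    [[(103, 0.7), (70, 0.3)]],               # hypothetical site 1
]
sorted_sites, rep = init_rep_queue(sites)
neg_score, site, pos = rep[0]
print(round(-neg_score, 1), site, pos)
```

The head of the queue gives the threshold τ used later: no unseen tuple can have an expected score above it.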
Sorted Access on Expected Scores: a Round

State at the start of the round (k = 2):

Rep. Queue             top-2 Queue
tuple      E[X]        tuple      grank
t_{2,2}    464         t_{2,1}    5.4
t_{3,2}    432         t_{1,1}    7.9
t_{1,2}    421

1. The server pops the head of the Rep. Queue, $t_{2,2}$ with E[X] = 464 and lrank 3.4.
2. Site 2 sends its next tuple, $t_{2,3}$ with E[X] = 429, which enters the Rep. Queue.
3. The server sends $X_{2,2}$ to the other sites.
4. The other sites return their local ranks for $t_{2,2}$: 0.7 and 1.5.
5. The server computes the global rank of $t_{2,2}$: 3.4 + 0.7 + 1.5 = 5.6.
6. $t_{2,2}$ enters the top-2 Queue and displaces $t_{1,1}$ (grank 7.9).

State at the end of the round:

Rep. Queue             top-2 Queue
tuple      E[X]        tuple      grank
t_{3,2}    432         t_{2,1}    5.4
t_{2,3}    429         t_{2,2}    5.6
t_{1,2}    421

Now the only question is: when may we safely terminate and be certain we have the global top-k?
Sorted Access on Expected Scores: Termination

top-2 Queue            Rep. Queue
tuple      grank       tuple      E[X]
t_{2,1}    5.4         t_{3,2}    432
t_{2,2}    5.6         t_{2,3}    429
                       t_{1,2}    421

The largest element of the top-k queue is an upper bound $r^+_\lambda$ on the global rank that any seen tuple $t$ with pdf $X$ may have and still be in the top-k at round $\lambda$.

The head of the Rep. Queue, with expected score $\tau$, bounds the expected score of any unseen tuple $t$ from above: $E[X] \le \tau$.

How can we derive a lower bound $r^-_\lambda$ on the global rank of any unseen tuple $t$, so that it is safe to terminate at round $\lambda$ as soon as $r^+_\lambda \le r^-_\lambda$?
Sorted Access on Expected Scores: a Lower Bound?

We introduce two methods to find a lower bound $r^-_\lambda$ for any unseen tuple $t$ at round $\lambda$:
- Markov Inequality
- Linear Programming
Markov Inequality Lower Bound

We know that the pdf of any unseen $t$ must satisfy $E[X] \le \tau$.

We can use the Markov Inequality to lower bound the local rank of $t$ at any site $s_i$ with database $D_i$:

$$
\begin{aligned}
r(t, D_i) &= \sum_{j=1}^{n_i} \Pr[X_j > X] = n_i - \sum_{j=1}^{n_i} \Pr[X \ge X_j] \\
&\ge n_i - \sum_{j=1}^{n_i} \sum_{\ell=1}^{b_{ij}} p_{i,j,\ell} \frac{E[X]}{v_{i,j,\ell}} \qquad \text{(Markov Ineq.)} \\
&\ge n_i - \sum_{j=1}^{n_i} \sum_{\ell=1}^{b_{ij}} p_{i,j,\ell} \frac{\tau}{v_{i,j,\ell}} = r^-(t, D_i). \qquad (6)
\end{aligned}
$$

The Markov step makes this bound loose.

Now the global rank $r(t)$ must satisfy

$$ r(t) \ge \sum_{i=1}^{m} r^-(t, D_i) = r^-_\lambda \qquad (7) $$
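Equations (6) and (7) translate into a few lines of code. This sketch assumes all scores are strictly positive (the bound divides by each value), and the function names are illustrative. The single-site data and τ = 50 are made up to show the bound next to an exact local rank.

```python
def markov_local_bound(site_pdfs, tau):
    """Equation (6): n_i - sum_j sum_l p_{i,j,l} * tau / v_{i,j,l}."""
    n_i = len(site_pdfs)
    return n_i - sum(p * tau / v for pdf in site_pdfs for v, p in pdf)

def markov_global_bound(sites, tau):
    """Equation (7): sum the per-site bounds to bound the global rank."""
    return sum(markov_local_bound(db, tau) for db in sites)

def exact_local_rank(pdf, site_pdfs):
    """Exact rank of an outside tuple against a site, for comparison only."""
    return sum(
        p * sum(p2 for other in site_pdfs for v2, p2 in other if v2 > v)
        for v, p in pdf
    )

# Hypothetical site and threshold.
site = [
    [(120, 0.8), (62, 0.2)],
    [(103, 0.7), (70, 0.3)],
    [(98, 1.0)],
]
tau = 50.0
unseen = [(50.0, 1.0)]  # an unseen tuple whose expected score equals tau
print(round(markov_local_bound(site, tau), 3), exact_local_rank(unseen, site))
```

For this data the bound is well below the exact local rank of 3, and for large τ the raw value can even fall below zero, which illustrates how loose it can be.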
Linear Programming Lower Bound

Any unseen tuple $t$ must have $E[X] \le \tau$.

We have seen how to derive a lower bound $r^-_\lambda$ on the global rank of any unseen tuple $t$ using Markov's Inequality.

We want $r^-_\lambda$ to be as tight as possible, which means finding the smallest local rank $r^-(t, D_i)$ that an unseen tuple could attain at each site.

We can use Linear Programming to derive $r^-(t, D_i)$ at each site and so obtain a tight $r^-_\lambda$.
Linear Programming

The idea is to construct, at each site $s_i$, the best possible $X$ for an unseen tuple $t$: the one that obtains the smallest possible local rank at $s_i$.

$X$ could take arbitrary $v_\ell$'s as its possible score values, some of which may not exist in the value universe $U_i$ at site $s_i$.

We can show that this is not a problem by studying the semantics of the $r(t, D_i)$'s and the $q(v)$'s.
Linear Programming: a Note on q(v)'s

[Figure: staircase curve of q(v) over the values of U_i]

Recall that $r(t_{i,j}, D_y) = \sum_{\ell=1}^{b_{i,j}} p_{i,j,\ell} \, q_y(v_{i,j,\ell})$ and that $q(v)$ is essentially a staircase curve.

$X$ may take a value $v_\ell$ not in $U_i$, with $v_2$ as its nearest left neighbor in $U_i$.

Even if $X$ takes a value $v_\ell$ not in $U_i$, we can decrease $v_\ell$ until we hit $v_2$ in $U_i$. $E[X] \le \tau$ clearly still holds, as we are only decreasing the value of one of the choices in $X$.

Also note that during this transformation $q(v_\ell) = q(v_2)$, so the local rank of $t$ remains the same.
Linear Programming Formulation

Now we can assume $X$ draws its values from $U_i$.

Then we can define a linear program with the constraints

$$
\begin{aligned}
& 0 \le p_\ell \le 1, \qquad \ell = 1, \ldots, \gamma = |U_i| \\
& p_1 + \ldots + p_\gamma = 1 \\
& p_1 v_1 + \ldots + p_\gamma v_\gamma \le \tau
\end{aligned}
$$

and minimize the local rank, which is

$$ r(X, D_i) = \sum_{\ell=1}^{\gamma} p_\ell \, q_i(v_\ell) $$
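Because this program has only two constraints besides non-negativity, an optimal solution exists that puts probability on at most two values. The sketch below exploits that by enumerating those candidate vertices instead of calling an LP solver; this is a substitution made here for a dependency-free illustration, not the method the slides prescribe. The function name, the value universe and τ are hypothetical.

```python
def lp_local_bound(values, q, tau):
    """Minimize sum_l p_l * q[l] s.t. p >= 0, sum p = 1, sum p_l * values[l] <= tau.

    values[l] is the l-th value of the site's universe U_i and q[l] = q_i(values[l]).
    """
    best = None
    gamma = len(values)
    for a in range(gamma):
        # Vertex with a single non-zero probability: X is the constant values[a].
        if values[a] <= tau:
            best = q[a] if best is None else min(best, q[a])
        for b in range(gamma):
            # Vertex mixing a value below tau with one above it so that E[X] = tau.
            if values[a] < tau < values[b]:
                w = (values[b] - tau) / (values[b] - values[a])  # weight on values[a]
                cost = w * q[a] + (1 - w) * q[b]
                best = cost if best is None else min(best, cost)
    if best is None:
        raise ValueError("no distribution over these values has mean <= tau")
    return best

# Hypothetical site: the staircase values of the earlier single-database example.
values = [62, 70, 98, 103, 120]
q = [2.8, 2.5, 1.5, 0.8, 0.0]
print(round(lp_local_bound(values, q, 100.0), 4))
```

The double loop costs quadratic time in $\gamma$; a general-purpose LP solver or a convex-hull scan over the points $(v_\ell, q_i(v_\ell))$ would be the natural replacement for large universes.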