 
Coalgebraic Tools for Randomness-Conserving Protocols
Matvey Soloviev (Cornell University)
Joint work with Dexter Kozen
RAMiCS 2018, Groningen

This Talk
- A coalgebraic model for constructing and reasoning about state-based protocols that implement efficient reductions among random processes (efficient = conserves randomness)
- Basic tools that allow efficient protocols to be constructed compositionally
- Tradeoffs between latency and efficiency
- Several examples of efficient reductions
- Toward a general coalgebraic semantics of reductions

Randomness as a Computational Resource
Randomness is a resource to be conserved
- Information and coding [Shannon]
- Probabilistic complexity and derandomization [Luby]
- Pseudo-random number generation [Yao, Nisan, Wigderson]
- Extracting strong randomness from weak sources [von Neumann, Elias, Blum]
A recent application: routing in networks
- Randomized routing, gossip protocols, load balancing
- Desirable to minimize local state to achieve high throughput

Measuring Randomness
Discrete reduction protocol: a procedure that maps an input stream to an output stream
  abcbbbcabbaccbacababbcba → [discrete reduction protocol] → 011000010110010101111010
- If the input sequence comes from a random process, then the statistical properties of the input stream impart statistical properties to the output stream
- We can think of the process as a reduction between random sources
- But randomness can be lost ...

Shannon Entropy
Entropy of a discrete distribution $\mu = (p_1, \ldots, p_n)$:
  $H(\mu) = -\sum_i p_i \log p_i$
- Usually described as a measure of uncertainty or information content
- Represents an absolute limit on lossless compression (Shannon source coding theorem, 1948)
- $H(\mu)$ = the number of fair coin flips $\mu$ is worth

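A minimal numerical check of the definition, assuming logarithms are taken base 2 so that entropy is measured in fair coin flips. The function name `entropy` is illustrative and does not come from the talk.

```python
from math import log2

def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution given as a list of probabilities."""
    # Terms with p = 0 are skipped: p * log(p) tends to 0 as p tends to 0.
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))                 # fair coin
print(entropy([0.5, 0.25, 0.125, 0.125]))  # a dyadic four-letter distribution
print(entropy([1/3, 2/3]))                 # biased coin, a bit less than one fair flip
```

Dyadic distributions give exact values here because every probability is a power of 1/2.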
Entropy as a Measure of Randomness
The entropy of $\mu$ is the number of fair coin flips it is worth
  0110010101111010 → [discrete reduction protocol] → stream of digits distributed as $\mu$
- $1/H(\mu)$ is an upper bound on the rate of production
- Achievable asymptotically (requires unbounded latency)
  stream of digits distributed as $\mu$ → [discrete reduction protocol] → 0110010101111010
- $H(\mu)$ is an upper bound on the rate of production
- Achievable asymptotically (requires unbounded latency)

Efficiency of a Simulation
  stream of digits over $\Sigma$, distributed as $\mu$ → [discrete reduction protocol] → stream of digits over $\Gamma$, distributed as $\nu$
  $\text{Efficiency} = \dfrac{E_{\text{prod}} \cdot H(\nu)}{E_{\text{cons}} \cdot H(\mu)} \le 1$
  $E_{\text{cons}}$ = expected number of digits consumed
  $E_{\text{prod}}$ = expected number of digits produced
  $H(\mu)$ = entropy of input distribution
  $H(\nu)$ = entropy of output distribution
- Measures the amount of randomness lost in the conversion
- May vary with time
- Cannot exceed unity [Shannon]
- Unity is achievable asymptotically [Elias, Cover & Thomas]; requires unbounded latency

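Sometimes Perfect Efficiency is Achievable
[Figure: a fair coin (H: 1/2, T: 1/2) and the distribution a: 1/2, b: 1/4, c: 1/8, d: 1/8, related by a binary tree whose leaves a, b, c, d correspond to the codewords 0, 10, 110, 111]
  $H(\tfrac12, \tfrac14, \tfrac18, \tfrac18) = 7/4$
  $H(\tfrac12, \tfrac12) = 1$
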
The von Neumann Trick [1951]
To simulate a fair coin with a bias-$p$ coin (outcome probabilities $p$ and $1-p$), flip the bias-$p$ coin twice:
  01 ⇒ H
  10 ⇒ T
  00 or 11 ⇒ flip again
Oblivious to the bias of the input coin, but efficiency is poor:
  for $p = 1/3$, $E_{\text{cons}}/E_{\text{prod}} = 4.5$
  Shannon says the best possible ratio is $1/\bigl(-(1/3)\log(1/3) - (2/3)\log(2/3)\bigr) \approx 1.089$

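The trick can be sketched in a few lines of Python. The names `von_neumann` and `flips_per_output` are illustrative, input flips are assumed to be encoded as the integers 0 and 1, and logarithms are assumed base 2.

```python
from fractions import Fraction
from math import log2

def von_neumann(bits):
    """Yield fair outputs 'H'/'T' from an iterable of biased flips (0 or 1)."""
    it = iter(bits)
    for first in it:
        second = next(it, None)
        if second is None:       # odd leftover flip: no complete pair
            return
        if first != second:      # 01 => H, 10 => T; 00 and 11 are discarded
            yield 'H' if (first, second) == (0, 1) else 'T'

def flips_per_output(p):
    """Expected flips per output: 2 flips per pair, each pair succeeds w.p. 2p(1-p)."""
    return 2 / (2 * p * (1 - p))

cost = flips_per_output(Fraction(1, 3))            # E_cons / E_prod for p = 1/3
h = -(1/3) * log2(1/3) - (2/3) * log2(2/3)         # entropy of the bias-1/3 coin
print(list(von_neumann([0, 1, 1, 1, 1, 0, 0, 0])))
print(cost, 1 / h, (1 / cost) / h)                 # ratio, Shannon bound, efficiency
```

The last printed number is the efficiency from the earlier formula with $H(\nu) = 1$; it comes out to roughly a quarter.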
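A More Efficient Protocol
[Figure: state diagram over a bias-1/3 coin, with transitions of probability 1/3 and 2/3 and outputs H and T]
- Not oblivious to the bias $p = 1/3$, but efficiency is better: $E_{\text{cons}}/E_{\text{prod}} = 2$
- This is optimal for single-digit-output protocols
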
Other Direction ($(\tfrac12, \tfrac12) \Rightarrow (\tfrac13, \tfrac23)$)
[Figure: state diagram over a fair coin; every transition has probability 1/2, with one transition emitting T and one emitting H]
  $\Pr(H) = \tfrac14 + \tfrac1{16} + \tfrac1{64} + \cdots = \tfrac13$
  $\Pr(T) = \tfrac12 + \tfrac18 + \tfrac1{32} + \cdots = \tfrac23$
Q: Is this optimal?
A: No! $E_{\text{cons}}/E_{\text{prod}} = 2$, but $-\tfrac13\log\tfrac13 - \tfrac23\log\tfrac23 \approx 0.92$

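A sketch of one protocol consistent with the two series above. The assignment of input symbols to transitions is an assumption (0 emits T; 1 followed by 0 emits H; 1 followed by 1 starts over), since the slide only gives the probabilities; the name `fair_to_third` is likewise illustrative.

```python
from fractions import Fraction

def fair_to_third(bits):
    """Turn fair flips (0 or 1) into 'H'/'T' with Pr(H) = 1/3 and Pr(T) = 2/3."""
    it = iter(bits)
    for b in it:
        if b == 0:
            yield 'T'            # probability 1/2
            continue
        nxt = next(it, None)
        if nxt is None:          # input ended in the middle of a round
            return
        if nxt == 0:
            yield 'H'            # probability 1/4
        # nxt == 1: no output, start over (probability 1/4)

# Partial sums of the two geometric series, computed exactly.
pr_h = sum(Fraction(1, 4) ** k for k in range(1, 30))
pr_t = sum(Fraction(1, 2) * Fraction(1, 4) ** k for k in range(30))

# Expected flips per output E solves E = 1/2*1 + 1/4*2 + 1/4*(2 + E).
e_cons = (Fraction(1, 2) + Fraction(1, 4) * 2 + Fraction(1, 4) * 2) / (1 - Fraction(1, 4))

print(list(fair_to_third([0, 1, 0, 1, 1, 0])))
print(float(pr_h), float(pr_t), e_cons)
```

Two fair flips per output against an entropy of about 0.92 bits per output is what leaves room for improvement.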
How to Do Better?
[Figure: table of H/T output strings assigned by the improved protocol]
Q: Is this optimal?
A: No, but better! $E_{\text{cons}}/E_{\text{prod}} = 5/2.625 \approx 1.905$

Latency
This protocol has many more states, and we'll have to read in at least 4 symbols before we output anything.
Define: Latency = expected consumption before producing at least one output symbol.
Generally, higher latency = longer input = higher-grained probability (sub)space = more leeway to carve it up into "correctly sized" chunks for better efficiency.
This tradeoff is inevitable whenever no perfect protocol exists.

Asymptotic optimality is not everything
It's known that asymptotically optimal families of reductions exist.
Now we can say that some are better than others: it matters whether efficiency $1 - \varepsilon$ would require latency $O(1/\varepsilon)$, $O(1/\varepsilon^2)$ or even worse.

Notation
- $\Sigma, \Gamma$: finite alphabets
- $\Sigma^*$ = finite words over $\Sigma$; $x, y, \ldots \in \Sigma^*, \Gamma^*$
- $\Sigma^\omega$ = $\omega$-words (streams) over $\Sigma$; $\alpha, \beta, \ldots \in \Sigma^\omega, \Gamma^\omega$
- $\preceq$ prefix, $\prec$ proper prefix
- $\mu$ is a probability measure on $\Sigma$; endow $\Sigma^\omega$ with the product measure: each symbol independent and distributed as $\mu$
  $\mu(a_1 a_2 \cdots a_n) = \mu(a_1)\,\mu(a_2) \cdots \mu(a_n)$
- The measurable sets of $\Sigma^\omega$ are the Borel sets of the Cantor space topology whose basic open sets are the intervals $\{\alpha \in \Sigma^\omega \mid x \prec \alpha\}$ for $x \in \Sigma^*$
  $\mu(\{\alpha \in \Sigma^\omega \mid x \prec \alpha\}) = \mu(x)$

Protocols
A protocol is a coalgebra $(S, \delta)$ where $\delta : S \times \Sigma \to S \times \Gamma^*$ (a form of Mealy automaton)
Extend $\delta$ to domain $S \times \Sigma^*$ by coinduction:
  $\delta(s, \varepsilon) = (s, \varepsilon)$
  $\delta(s, ax) = \text{let } (t, y) = \delta(s, a) \text{ in let } (u, z) = \delta(t, x) \text{ in } (u, yz)$
It follows that
  $\delta(s, xy) = \text{let } (t, z) = \delta(s, x) \text{ in let } (u, w) = \delta(t, y) \text{ in } (u, zw)$

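The extension to finite words is just a left-to-right fold, sketched here in Python. The helper `extend` and the two-state example `delta` (states `'start'` and `'wait'`, input alphabet `{'0','1'}`, output alphabet `{'H','T'}`) are hypothetical illustrations, not definitions from the talk.

```python
def extend(delta):
    """Lift a one-step map delta(s, a) -> (s', y) to finite input words."""
    def delta_star(s, word):
        out = ""
        for a in word:           # consume one letter, append whatever it emits
            s, y = delta(s, a)
            out += y
        return s, out
    return delta_star

def delta(s, a):
    """Hypothetical protocol: '0' emits T; '1' then '0' emits H; '1' then '1' emits nothing."""
    if s == 'start':
        return ('start', 'T') if a == '0' else ('wait', '')
    return ('start', 'H') if a == '0' else ('start', '')

delta_star = extend(delta)
print(delta_star('start', '010110'))

# The derived law delta(s, xy): run x, then run y from the state reached.
t, z = delta_star('start', '01')
u, w = delta_star(t, '0110')
print((u, z + w) == delta_star('start', '010110'))
```

Outputs are strings over the output alphabet, so an empty string models a step that consumes input without producing anything.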
Extension to Streams
A protocol $\delta$ also induces a partial map $\delta^\omega : S \times \Sigma^\omega \rightharpoonup \Gamma^\omega$ by coinduction:
  $\delta^\omega(s, a\alpha) = \text{let } (t, z) = \delta(s, a) \text{ in } z \cdot \delta^\omega(t, \alpha)$
It follows that
  $\delta^\omega(s, x\alpha) = \text{let } (t, z) = \delta(s, x) \text{ in } z \cdot \delta^\omega(t, \alpha)$
Given $\alpha \in \Sigma^\omega$, this defines a unique infinite string $\delta^\omega(s, \alpha) \in \Gamma^\omega$, except in the degenerate case in which only finitely many output letters are ever produced
A protocol is productive (wrt a given probability measure on input streams) if, starting in any state, an output symbol is produced within finite expected time (therefore w.p. 1)

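A lazy generator is a natural way to sketch the stream extension in Python. The name `delta_omega` and the example protocol `delta` are hypothetical; the input stream is assumed to be any (possibly infinite) iterable of letters.

```python
from itertools import cycle, islice

def delta_omega(delta, s, stream):
    """Lazily produce the output stream of a protocol run on an input stream."""
    for a in stream:
        s, z = delta(s, a)
        yield from z             # z may be empty: input consumed, nothing emitted

def delta(s, a):
    """Hypothetical protocol: '0' emits T; '1' then '0' emits H; '1' then '1' emits nothing."""
    if s == 'start':
        return ('start', 'T') if a == '0' else ('wait', '')
    return ('start', 'H') if a == '0' else ('start', '')

# First five output letters on the periodic input 010101...
print(list(islice(delta_omega(delta, 'start', cycle('01')), 5)))
```

On the all-ones input this example never emits a letter, which is the degenerate case where the map is undefined; asking the generator for an output there would loop forever.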
Reductions
Let $\nu$ be a probability measure on $\Gamma$, $\nu(a_1 \cdots a_n) = \nu(a_1) \cdots \nu(a_n)$
$(S, \delta, s)$ with start state $s \in S$ is a reduction from $\mu$ to $\nu$ if
  $\forall y \in \Gamma^* \quad \mu(\{\alpha \mid y \preceq \delta^\omega(s, \alpha)\}) = \nu(y)$
This implies that the symbols of $\delta^\omega(s, \alpha)$ are independent and distributed as $\nu$

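The defining condition can be bracketed exactly by enumerating all inputs of a fixed length. This is a sketch under assumptions: `prefix_mass_bounds` and the example protocol `delta` are hypothetical names, and the check only bounds the measure from both sides rather than proving the equality.

```python
from fractions import Fraction
from itertools import product

def delta(s, a):
    """Hypothetical protocol: '0' emits T; '1' then '0' emits H; '1' then '1' emits nothing."""
    if s == 'start':
        return ('start', 'T') if a == '0' else ('wait', '')
    return ('start', 'H') if a == '0' else ('start', '')

def run(delta, s, word):
    out = ""
    for a in word:
        s, z = delta(s, a)
        out += z
    return out

def prefix_mass_bounds(delta, s, mu, y, n):
    """Lower and upper bounds on mu({alpha | y is a prefix of the output}) from length-n inputs."""
    lower = undecided = Fraction(0)
    for word in product(mu, repeat=n):
        weight = Fraction(1)
        for a in word:
            weight *= mu[a]
        out = run(delta, s, word)
        if out.startswith(y):
            lower += weight          # y already appears as an output prefix
        elif y.startswith(out):
            undecided += weight      # output still too short to decide
    return lower, lower + undecided

mu = {'0': Fraction(1, 2), '1': Fraction(1, 2)}
nu = {'H': Fraction(1, 3), 'T': Fraction(2, 3)}
for y in ['T', 'H', 'TH', 'TT']:
    target = Fraction(1)
    for c in y:
        target *= nu[c]
    lo, hi = prefix_mass_bounds(delta, 'start', mu, y, 10)
    print(y, float(lo), float(target), float(hi))
```

As the input length grows, the two bounds close in on $\nu(y)$ for this example.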
Advantages of the Coalgebraic View
Many constructions in the information theory literature are expressed in terms of trees, but protocols are coalgebras $\delta : S \times \Sigma \to S \times \Gamma^*$, a form of Mealy automata, i.e. not trees
This class admits a final coalgebra
  $D : (\Gamma^*)^{\Sigma^+} \times \Sigma \to (\Gamma^*)^{\Sigma^+} \times \Gamma^*$
  $D(f, a) = (f@a, f(a))$, where $f@a(x) = f(ax)$, $a \in \Sigma$, $x \in \Sigma^+$
Extension to streams
  $D^\omega : (\Gamma^*)^{\Sigma^+} \times \Sigma^\omega \rightharpoonup \Gamma^\omega$
  $D^\omega(f, a\alpha) = f(a) \cdot D^\omega(f@a, \alpha)$

Advantages of the Coalgebraic View (continued)
A state $f : \Sigma^+ \to \Gamma^*$ can be viewed as a labeled tree with nodes $\Sigma^*$ and edge labels $\Gamma^*$
- The nodes $xa$ are the children of $x$ for $x \in \Sigma^*$ and $a \in \Sigma$
- The label on the edge $(x, xa)$ is $f(xa)$
- The tree $f@x$ is the subtree rooted at $x \in \Sigma^*$, where $f@x(y) = f(xy)$
For any coalgebra $(S, \delta)$, there is a unique coalgebra morphism $h : (S, \delta) \to ((\Gamma^*)^{\Sigma^+}, D)$ defined coinductively by
  $(h(s)@a, h(s)(a)) = \text{let } (t, z) = \delta(s, a) \text{ in } (h(t), z)$
Protocols can inherit structure from the final coalgebra under $h^{-1}$, thereby providing a mechanism for transferring results on trees to state transition systems

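One way to make the tree view concrete: represent a state of the final coalgebra as a Python function on nonempty words and check the morphism equation on sample inputs. The names `h`, `at` and the example protocol `delta` are hypothetical, and the check covers only finitely many words.

```python
def delta(s, a):
    """Hypothetical protocol: '0' emits T; '1' then '0' emits H; '1' then '1' emits nothing."""
    if s == 'start':
        return ('start', 'T') if a == '0' else ('wait', '')
    return ('start', 'H') if a == '0' else ('start', '')

def h(delta, s):
    """Tree view of state s: f(x) is the output emitted on the last letter of the nonempty word x."""
    def f(word):
        state, z = s, ""
        for a in word:
            state, z = delta(state, a)
        return z
    return f

def at(f, x):
    """Subtree f@x, defined by (f@x)(y) = f(xy)."""
    return lambda y: f(x + y)

# Morphism equation: h(s)(a) = z and h(s)@a = h(t), where (t, z) = delta(s, a).
f = h(delta, 'start')
t, z = delta('start', '1')
samples = ['0', '1', '10', '11', '010', '110']
print(f('1') == z, all(at(f, '1')(y) == h(delta, t)(y) for y in samples))
```

The edge label `f(xa)` depends only on the state reached after `x` and on the letter `a`, which is why a finite-state protocol unfolds to a tree with only finitely many distinct subtrees.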
Restart Protocols
A prefix code is a subset $A \subseteq \Sigma^*$ such that every element of $\Sigma^\omega$ has at most one prefix in $A$
- The elements of a prefix code are $\preceq$-incomparable
- A prefix code is exhaustive (wrt $\mu$) if $\alpha \in \Sigma^\omega$ has a prefix in $A$ w.p. 1
A restart protocol $(S, \delta, s)$ is determined by a function $f : A \rightharpoonup \Gamma^*$, where $A$ is an exhaustive prefix code
Intuitively, starting in $s$, read symbols of $\Sigma$ from the input stream until encountering a string $x \in A$, output $f(x)$, repeat

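A minimal sketch of building a restart protocol from such a function, in Python. It assumes `f` is given as a dict whose keys form the prefix code, with the empty string standing in for codewords on which nothing is output; the name `restart_protocol` and the choice of states (the part of the current codeword read so far) are illustrative.

```python
def restart_protocol(f):
    """Build (delta, start) from a dict f mapping each codeword in A to an output string."""
    def delta(s, a):
        x = s + a
        if x in f:
            return '', f[x]      # codeword complete: emit f(x) and restart
        return x, ''             # still a proper prefix: keep reading
    return delta, ''

# The von Neumann trick as a restart protocol over the prefix code {00, 01, 10, 11}.
von_neumann_code = {'01': 'H', '10': 'T', '00': '', '11': ''}
delta, start = restart_protocol(von_neumann_code)

s, out = start, ''
for a in '01111000':
    s, z = delta(s, a)
    out += z
print(s == start, out)
```

If the code were not exhaustive, an input with no prefix in the code would make the state grow without ever restarting, so the sketch relies on that assumption.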