

  1. Elements of a Nonstochastic Information Theory
Girish Nair, Dept. Electrical & Electronic Engineering, University of Melbourne
LCCC Workshop on Information and Control in Networks, Lund, Sweden, 17 October 2012

  2. Random Variables in Communications
In communications, unknown quantities/signals are usually modelled as random variables (rv's) & random processes, for good reasons:
- Physical laws governing electronic/photonic circuit noise give rise to well-defined distributions & random models, e.g. Gaussian thermal electronic noise, binary symmetric channels, Rayleigh fading, etc.
- Telecomm. systems are usually designed to be used many times, & each individual phone call/email/download may not be critically important...
- The system designer need only seek good performance in an average or expected sense, e.g. bit error rate, signal-to-noise ratio, outage probability.

  3. Nonrandom Variables in Control
In contrast, unknowns in control are often treated as nonstochastic variables or signals:
- Dominant disturbances are not necessarily electronic/photonic circuit noise, & may not follow well-defined probability distributions.
- Safety- & mission-criticality: performance guarantees are needed every time the plant is used, not just on average.

  4. Networked Control
Networked control combines both communications and control theories!
- How may nonstochastic analogues of key probabilistic concepts like independence, Markovness and information be usefully defined?

  5. Another Motivation: Channel Capacity
The ordinary capacity C of a channel is defined as the highest block-code bit-rate that permits an arbitrarily small probability of decoding error. I.e.

    C := \lim_{\varepsilon \to 0} \sup_{t \ge 0} \sup_{F_t} \frac{\log_2 |F_t|}{t+1}
       = \lim_{\varepsilon \to 0} \lim_{t \to \infty} \sup_{F_t} \frac{\log_2 |F_t|}{t+1}   (by subadditivity),

where F_t := a finite set of input words of length t+1, & the inner supremums are over all F_t s.t. for every x(0:t) in F_t, the corresponding random channel output word Y(0:t) can be mapped to an estimate X̂(0:t) with Pr[X̂(0:t) ≠ x(0:t)] ≤ ε.

  6. Information Capacity
Shannon's Channel Coding Theorem essentially gives an information-theoretic characterization of C for stationary memoryless stochastic channels:

    C = \sup_{t \ge 0} \sup \frac{I[X(0:t); Y(0:t)]}{t+1} = \lim_{t \to \infty} \sup \frac{I[X(0:t); Y(0:t)]}{t+1} = \sup I[X(0); Y(0)],

where I[·;·] := Shannon's mutual information functional, and the inner supremums are over all random input sequences X(0:t).
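As a concrete companion to this characterization, the sketch below (not from the talk) runs the standard Blahut-Arimoto iteration, which numerically computes C = sup I[X;Y] for a discrete memoryless channel; the binary symmetric channel matrix W at the end is an illustrative assumption.

    import numpy as np

    def blahut_arimoto(W, tol=1e-9, max_iter=10000):
        """Capacity in bits/use of the DMC with row-stochastic matrix W[x, y] = P(y | x)."""
        p = np.full(W.shape[0], 1.0 / W.shape[0])      # input distribution, start uniform
        for _ in range(max_iter):
            q = p @ W                                  # output distribution induced by p
            with np.errstate(divide="ignore", invalid="ignore"):
                log_ratio = np.where(W > 0, np.log2(W / q), 0.0)
            d = np.sum(W * log_ratio, axis=1)          # D(W(.|x) || q) in bits, per input x
            if d.max() - p @ d < tol:                  # bounds: p @ d <= C <= max_x d(x)
                break
            p = p * np.exp2(d)                         # multiplicative Blahut-Arimoto update
            p /= p.sum()
        return p @ d

    # Binary symmetric channel, crossover 0.1: C = 1 - h(0.1) ≈ 0.531 bits/use
    W = np.array([[0.9, 0.1], [0.1, 0.9]])
    print(blahut_arimoto(W))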

  7. Zero-Error Capacity
In 1956, Shannon also introduced the stricter notion of zero-error capacity C_0: the highest block-coded bit-rate that permits a probability of decoding error exactly equal to 0. I.e.

    C_0 := \sup_{t \ge 0} \sup_{F_t} \frac{\log_2 |F_t|}{t+1} = \lim_{t \to \infty} \sup_{F_t} \frac{\log_2 |F_t|}{t+1},

where F_t := a finite set of input words of length t+1, & the inner supremums are over all F_t s.t. for every x(0:t) in F_t, the corresponding channel output word Y(0:t) can be mapped to an estimate X̂(0:t) with Pr[X̂(0:t) ≠ x(0:t)] = 0.
Clearly, C_0 is (usually strictly) smaller than C.

  8. C0 as an "Information" Capacity?
Fact: C0 does not depend on the nonzero transition probabilities of the channel, and can be defined without any probability theory, in terms of the input-output graph that describes permitted channel transitions (see the sketch below).
- Q: Can we express C0 as the maximum rate of some nonstochastic information functional?
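A small sketch of this fact (illustrative, not from the talk): only the graph of "confusable" input pairs matters, and C0 = sup_n (1/n) log2 α(Gⁿ), where α is the independence number and Gⁿ the n-fold strong product of the confusability graph. Shannon's pentagon channel is used below; the helper names are mine.

    import math
    from itertools import product

    # Pentagon channel: input i can produce output i or i+1 (mod 5).
    outputs = {i: {i, (i + 1) % 5} for i in range(5)}
    # Two distinct inputs are confusable iff they share a possible output.
    confusable = {(i, j) for i in range(5) for j in range(5)
                  if i != j and outputs[i] & outputs[j]}

    def strong_power(n):
        """Vertices of G^n (n-tuples of inputs) and its adjacency test."""
        verts = list(product(range(5), repeat=n))
        def adj(u, v):
            # Strong product: adjacent iff every coordinate is equal or confusable.
            return u != v and all(a == b or (a, b) in confusable for a, b in zip(u, v))
        return verts, adj

    def alpha(verts, adj):
        """Independence number by plain branch-and-bound recursion."""
        def rec(cands):
            if not cands:
                return 0
            v, rest = cands[0], cands[1:]
            return max(rec(rest),                                    # exclude v
                       1 + rec([u for u in rest if not adj(u, v)]))  # include v
        return rec(verts)

    for n in (1, 2):
        verts, adj = strong_power(n)
        a = alpha(verts, adj)
        print(f"n={n}: alpha = {a}, C0 >= {math.log2(a) / n:.3f} bits/use")

For n = 2 this prints α = 5, i.e. a rate of (1/2) log2 5 ≈ 1.161 bits/use; Lovász (1979) showed this lower bound is in fact the exact C0 of the pentagon.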

  9. Outline
- (Motivation)
- Uncertain Variables
- Taxicab Partitions & Maximin Information
- C0 via Maximin Information
- Uniform LTI State Estimation over Erroneous Channels
- Conclusion
- Extensions & Future Work

  10. The Uncertain Variable Framework
Similar to probability theory, let an uncertain variable (uv) be a mapping X from some sample space Ω to a space 𝕏.
- E.g., each ω ∈ Ω may represent a particular combination of disturbances & inputs entering a system, & X may represent an output/state variable.
- For any particular ω, the value x = X(ω) is realised.
- Unlike prob. theory, assume no σ-algebra or measure on Ω.
[Figure: a point ω in the sample space Ω mapped by X to a value x in 𝕏.]
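A minimal sketch of the framework, assuming a finite sample space for concreteness (names are illustrative, not from the talk): a uv is just a function on Ω, with no measure attached.

    # Sample space: each element is one combination of disturbances & inputs.
    OMEGA = ["w1", "w2", "w3", "w4"]

    def X(w):
        """An uncertain variable: a plain mapping from OMEGA to a value space."""
        return {"w1": 0, "w2": 0, "w3": 1, "w4": 2}[w]

    # For a particular w, the value x = X(w) is realised.
    print(X("w3"))   # -> 1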

  11. Ranges
As in prob. theory, the ω-argument will often be omitted.
- Marginal range: [[X]] := {X(ω) : ω ∈ Ω} ⊆ 𝕏.
- Joint range: [[X,Y]] := {(X(ω), Y(ω)) : ω ∈ Ω} ⊆ 𝕏 × 𝕐.
- Conditional range: [[X|y]] := {X(ω) : Y(ω) = y, ω ∈ Ω}, for y ∈ [[Y]].
In the absence of statistical structure, the joint range completely characterises the relationship between uv's X & Y.
As [[X,Y]] = ∪_{y ∈ [[Y]]} [[X|y]] × {y}, the joint range can be determined from the conditional & marginal ranges, similar to the relationship between joint, conditional & marginal probability distributions (checked computationally below).
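The three ranges, and the identity relating them, are easy to compute for finite Ω; this sketch (my own illustration) checks the union identity above.

    OMEGA = range(6)
    X = lambda w: w % 2          # two illustrative uv's on a common sample space
    Y = lambda w: w % 3

    marginal_X = {X(w) for w in OMEGA}                       # [[X]]
    marginal_Y = {Y(w) for w in OMEGA}                       # [[Y]]
    joint = {(X(w), Y(w)) for w in OMEGA}                    # [[X,Y]]
    cond_X = lambda y: {X(w) for w in OMEGA if Y(w) == y}    # [[X|y]]

    # [[X,Y]] = union over y in [[Y]] of [[X|y]] x {y}
    rebuilt = {(x, y) for y in marginal_Y for x in cond_X(y)}
    assert rebuilt == joint
    print(joint)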

  12. Unrelatedness
X, Y are called unrelated if [[X,Y]] = [[X]] × [[Y]], or equivalently if [[X|y]] = [[X]] for every y ∈ [[Y]].
- Parallels the definition of mutual independence for rv's.
X, Y are called related if [[X,Y]] ⊂ [[X]] × [[Y]] without equality.
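In code, unrelatedness is a one-line set comparison; a sketch with illustrative uv's of my own choosing:

    OMEGA = range(6)

    def unrelated(X, Y, omega):
        """True iff [[X,Y]] equals the Cartesian product [[X]] x [[Y]]."""
        joint = {(X(w), Y(w)) for w in omega}
        prod = {(x, y) for x in {X(w) for w in omega}
                       for y in {Y(w) for w in omega}}
        return joint == prod

    # w%2 and w%3 realise all 6 combinations as w runs over 0..5: unrelated.
    print(unrelated(lambda w: w % 2, lambda w: w % 3, OMEGA))   # True
    # w%2 and w%4 miss half the product set: related (strict subset).
    print(unrelated(lambda w: w % 2, lambda w: w % 4, OMEGA))   # False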

  13. [Figure: joint ranges with conditional ranges [[X|y']] marked, for a) X, Y related and b) X, Y unrelated.]

  14. Nonstochastic Entropy
The a priori uncertainty associated with a uv X is captured by the Hartley entropy

    H_0[X] := \log_2 |[[X]]| \in [0, \infty].

- Continuous-valued uv's yield H_0[X] = ∞.
- For uv's with Lebesgue-measurable range in R^n, the 0th-order Rényi differential entropy

    h_0[X] := \log_2 \mu([[X]]) \in [-\infty, \infty]

(with μ the Lebesgue measure) is more useful.
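Both entropies reduce to simple size measurements; a sketch with assumed example ranges:

    import math

    def hartley_entropy(rng):
        """H_0[X] = log2 |[[X]]| in bits, for a finite marginal range."""
        return math.log2(len(rng))

    def h0_interval(a, b):
        """h_0[X] = log2 of the Lebesgue measure, for a uv with range [a, b]."""
        return math.log2(b - a)

    print(hartley_entropy({0, 1, 2, 3}))   # 2.0 bits: four possible values
    print(h0_interval(0.0, 0.5))           # -1.0: measure 1/2, finite & informative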

  15. Nonstochastic Information - Previous Definitions
H. Shingin & Y. Ohta, NecSys09 (expressed in the uv framework here):

    I_0[X;Y] := \inf_{y \in [[Y]]} \log_2 \frac{|[[X]]|}{|[[X|y]]|},   X discrete-valued,
    I_0[X;Y] := \inf_{y \in [[Y]]} \log_2 \frac{\mu([[X]])}{\mu([[X|y]])},   X continuous-valued.

G. Klir, 2006:

    T[X;Y] := H_0[X] + H_0[Y] - H_0[X,Y],   X, Y finite-valued;

something more complex for (X,Y) continuous-valued with convex range in R^n.
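For finite-valued uv's, both functionals can be evaluated directly from the joint range; a sketch with an assumed toy range (the continuous cases would use Lebesgue measure instead of counting):

    import math

    joint = {(0, 0), (0, 1), (1, 1), (2, 1)}             # an illustrative [[X,Y]]
    X_rng = {x for x, _ in joint}                        # [[X]]
    Y_rng = {y for _, y in joint}                        # [[Y]]
    cond = lambda y: {x for x, yy in joint if yy == y}   # [[X|y]]

    # Shingin & Ohta: worst-case log-ratio of prior to posterior range size.
    I0 = min(math.log2(len(X_rng) / len(cond(y))) for y in Y_rng)

    # Klir: Hartley-entropy analogue of mutual information.
    T = math.log2(len(X_rng)) + math.log2(len(Y_rng)) - math.log2(len(joint))

    print(I0, T)   # here I0 = 0 (one y leaves [[X]] unreduced), T ≈ 0.585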

  16. Comments on Previous Definitions
- Each gives different treatments of continuous- & discrete-valued variables.
- Klir's information has natural properties, but is purely axiomatic, with no demonstrated relevance to problems in communications or control.
- Shingin & Ohta's information: inherently asymmetric, but shown to be useful for studying control over errorless digital channels.

  17. Taxicab Connectivity
A pair of points (x,y), (x',y') ∈ [[X,Y]] is called taxicab connected, denoted (x,y) ↔ (x',y'), if there is a finite sequence (x_i, y_i), i = 1, ..., n, in [[X,Y]]:
i) beginning at (x_1, y_1) = (x, y),
ii) ending at (x_n, y_n) = (x', y'),
iii) with each point in the sequence differing in at most one coordinate from its predecessor.
Every point in this sequence must yield the same z-value as its predecessor, since it has either the same x- or y-coordinate. By induction, (x,y) & (x',y') yield the same z-value.
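Taxicab connectivity is a reachability notion, so for a finite joint range its classes can be found by breadth-first search; a sketch (my own illustration) follows. Counting these components is the natural step toward the taxicab partitions named in the outline.

    from collections import deque

    def taxicab_components(joint):
        """Partition a finite joint range into taxicab-connected components."""
        joint, comps = set(joint), []
        while joint:
            seed = joint.pop()
            comp, frontier = {seed}, deque([seed])
            while frontier:
                x, y = frontier.popleft()
                # Points differing from (x, y) in at most one coordinate.
                step = {p for p in joint if p[0] == x or p[1] == y}
                joint -= step
                comp |= step
                frontier.extend(step)
            comps.append(comp)
        return comps

    # Two "blocks" sharing no x- or y-coordinate: two components.
    example = {(0, 0), (0, 1), (1, 1), (5, 5), (5, 6)}
    print(taxicab_components(example))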

  18. Taxicab Connectedness - Examples
([[X,Y]] = shaded area)
[Figure: three example joint ranges. In the first, (x,y) and (x',y') are taxicab disconnected, and also disconnected in the usual sense; in the second, (x,y) ↔ (x',y') even though the pair is disconnected in the usual sense; in the third, the pair is not taxicab connected even though it is connected in the usual sense.]
